CN111785321B - DNA binding residue prediction method based on deep convolutional neural network - Google Patents

Info

Publication number
CN111785321B
Authority
CN
China
Prior art keywords
residue
protein sequence
layer
convolutional neural
protein
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010533489.4A
Other languages
Chinese (zh)
Other versions
CN111785321A (en)
Inventor
胡俊
白岩松
樊学强
郑琳琳
张贵军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Zhaoji Biotechnology Co ltd
Shenzhen Xinrui Gene Technology Co ltd
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-06-12
Filing date
2020-06-12
Publication date
2022-04-05
Application filed by Zhejiang University of Technology ZJUT
Priority to CN202010533489.4A
Publication of CN111785321A
Application granted
Publication of CN111785321B

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Genetics & Genomics (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A DNA-binding residue prediction method based on a deep convolutional neural network. First, for an input protein sequence of L residues whose ligand-binding residues are to be predicted, the psi-blast and PSSpred programs are used to obtain the matrices PSSM and PSS. Then, the two matrices are combined into a feature matrix F. Next, the protein sequence is processed into residue samples. After that, a deep convolutional neural network is built, a data set is constructed from protein sequences with known binding residues and divided into M data subsets, and M network models are trained on these M subsets. Finally, the protein sequence to be predicted is processed into residue samples and fed into the M trained network models, and the predictions of the M models are combined to decide whether each residue in the protein sequence is a binding residue. The invention has low computational cost and high prediction accuracy.

Figure 202010533489

Description

DNA binding residue prediction method based on deep convolutional neural network
Technical Field
The invention relates to the fields of bioinformatics, pattern recognition and computer application, in particular to a DNA binding residue prediction method based on a deep convolutional neural network.
Background
Protein-ligand interactions are ubiquitous and indispensable in life processes, and play a very important role in the recognition of biomolecules and in signaling. DNA is one such ligand. Accurately identifying the DNA-binding residues in a protein sequence helps in understanding the function of the protein, analyzing the interaction mechanism between the protein and DNA, and designing drug target proteins, and is therefore of considerable biological significance.
Many methods for predicting DNA-binding residues in protein sequences have been proposed, for example: DISPLAR (Tjong H, Zhou H. DISPLAR: an accurate method for predicting DNA-binding sites on protein surfaces [J]. Nucleic Acids Research, 2007, 35(5): 1465-1477); DELIA (Xia C, Pan X, Shen H, et al. Protein-ligand binding residue prediction enhancement through hybrid deep heterogeneous learning of sequence and structure data [J]. Bioinformatics); a convolutional-neural-network method (Zeng H, et al. Convolutional neural network architectures for predicting DNA-protein binding [J]. Bioinformatics, 2016, 32(12): 121-); and ENSEMBLE-CNN (Zhang Y, Qiao S, Ji S, et al. Predicting DNA Binding Sites in Protein Sequences by an Ensemble Deep Learning Method [C]. International Conference on Intelligent Computing, 2018: 301-). Although these methods can predict DNA-binding residues in a protein sequence, they generally rely on large amounts of experimental data and on machine learning algorithms, so their computational cost is high. Moreover, because noise in the training set receives too little attention, their prediction accuracy is not guaranteed to be optimal and needs further improvement.
In conclusion, the existing prediction method of the DNA binding residues has a great gap from the requirement of practical application in the aspects of calculation cost and prediction precision, and needs to be improved urgently.
Disclosure of Invention
In order to overcome the defects of the existing DNA binding residue prediction method in two aspects of calculation cost and prediction precision, the invention provides a DNA binding residue prediction method based on a deep convolutional neural network, which is low in calculation cost and high in prediction precision.
The technical scheme adopted by the invention for solving the technical problems is as follows:
A DNA-binding residue prediction method based on a deep convolutional neural network, the method comprising the following steps:
1) input a protein sequence S of L residues for which DNA-binding residues are to be predicted;
2) for the protein sequence S, use the psi-blast program (https://toolkit.tuebingen.mpg.de/tools/psiblst) to search the protein sequence database swissprot (https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/) and generate a position-specific scoring matrix of size L × 20, denoted PSSM;
3) for the protein sequence S, use the PSSpred program (https://zhanglab.ccmb.med.umich.edu/PSSpred) to search the protein sequence database nr (https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr) and generate a protein secondary structure matrix of size L × 3, denoted PSS;
4) combine the two-dimensional matrices obtained in steps 2) and 3) into an L × 23 feature matrix, denoted F;
5) add 8 rows of zeros before and after F; then, starting from the 9th row of F and ending at the (L-9)th row of F, take the residue corresponding to the middle row as the prediction target and the 8 adjacent rows before and after it as the feature matrix of that residue;
6) build a deep convolutional neural network to predict the DNA-binding residues of the protein sequence S; the network has eight layers, of which the first seven are convolutional layers and the last is a fully connected layer; each convolutional layer contains a two-dimensional convolution layer, a normalization layer and a pooling layer; the output of each layer serves as the input of the next; and the fully connected layer uses a sigmoid activation function so that the value passed on from the convolutional layers is mapped into the range (0, 1);
7) generate residue samples from a protein sequence with known binding residues by steps 2) to 5), and repeat this to build a training set; divide the training set into M training subsets, where the positive residue samples of each subset comprise all positive samples of the training set, and negative samples are added at random to each subset at a positive-to-negative ratio of 1:2;
8) train the deep convolutional neural network built in 6) on each of the M training subsets from 7), each training run using the two-class cross-entropy loss function to adjust the network parameters, which yields M deep convolutional neural network models in total; the two-class cross-entropy loss function is written as:
$$Y = -\left[\,u \log \hat{u} + (1 - u)\log(1 - \hat{u})\,\right]$$
where u is the true label of the residue under test in the protein sequence, $\hat{u}$ is the predicted output value of the network model, and Y characterizes the gap between the predicted output and the true label;
9) feed the residue samples generated from the protein sequence S into the M models obtained in 8); an output probability threshold, threshold, is set for each model, and the positions whose output value exceeds threshold are the binding residues predicted by that model; every residue sample in S is thus predicted by the M models, giving M prediction results, and the majority outcome among these M results is the final prediction.
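Steps 4) and 5) can be illustrated with a minimal sketch in plain Python. The function names `combine_features` and `residue_windows` are hypothetical, and two points are assumptions rather than statements of the patent: each window is taken to hold 17 rows (the centre residue plus 8 rows on each side), and one window is cut for every residue of the sequence instead of reproducing the literal row range given in step 5).

```python
from typing import List

Matrix = List[List[float]]


def combine_features(pssm: Matrix, pss: Matrix) -> Matrix:
    """Step 4): join the L x 20 PSSM and the L x 3 PSS row by row into L x 23."""
    if len(pssm) != len(pss):
        raise ValueError("PSSM and PSS must describe the same number of residues")
    return [list(p) + list(s) for p, s in zip(pssm, pss)]


def residue_windows(features: Matrix, half: int = 8) -> List[Matrix]:
    """Step 5): zero-pad F by `half` rows on each side, then cut one window per residue.

    ASSUMPTION: each window holds 2 * half + 1 rows, i.e. the centre row is included.
    """
    width = len(features[0])
    top = [[0.0] * width for _ in range(half)]
    bottom = [[0.0] * width for _ in range(half)]
    padded = top + [list(row) for row in features] + bottom
    size = 2 * half + 1
    # Window i starts at padded row i, so its centre (offset `half`) is residue i.
    return [padded[i:i + size] for i in range(len(features))]
```

Under these assumptions a sequence of L residues yields L samples of shape 17 × 23, and the zero rows only appear in windows near the two ends of the sequence.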
The technical conception of the invention is as follows: first, for an input protein sequence of L residues whose ligand-binding residues are to be predicted, the psi-blast and PSSpred programs are used to obtain the matrices PSSM and PSS; then the two matrices are combined into a feature matrix F; next, the protein sequence is processed into residue samples; after that, a deep convolutional neural network is built, a data set is constructed from protein sequences with known binding residues and divided into ten data subsets, and ten network models are trained on these ten subsets; finally, the protein sequence to be predicted is processed into residue samples and fed into the ten trained network models, and the predictions of the ten models are combined to decide whether each residue in the protein sequence is a binding residue.
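The construction of the training subsets (all positive samples in every subset, negatives added at random at a 1:2 ratio) might look like the sketch below. The name `build_training_subsets`, the `seed` parameter, and the choice to draw negatives independently for each subset are assumptions; the text does not say whether negatives may recur across subsets.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def build_training_subsets(
    positives: Sequence[T],
    negatives: Sequence[T],
    m: int = 10,
    neg_per_pos: int = 2,
    seed: int = 0,
) -> List[Tuple[List[T], List[T]]]:
    """Return m (positives, sampled_negatives) pairs with a 1:neg_per_pos ratio."""
    rng = random.Random(seed)
    # Cap the draw if the pool of negatives is smaller than the requested ratio.
    n_neg = min(len(negatives), neg_per_pos * len(positives))
    subsets = []
    for _ in range(m):
        # ASSUMPTION: negatives are drawn independently for every subset,
        # without replacement inside a subset.
        sampled = rng.sample(list(negatives), n_neg)
        subsets.append((list(positives), sampled))
    return subsets
```

Because binding residues are usually far rarer than non-binding ones, each subset sees every positive sample but only a small random slice of the negatives, so the ten models differ mainly in the negatives they were trained on.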
The beneficial effects of the invention are as follows: on the one hand, starting from a feature matrix built from sequence information, the protein sequence is processed into residue samples and a deep convolutional network model is built, which lays the groundwork for improving prediction accuracy; on the other hand, ten data subsets are constructed and used to train ten network models, and the predictions of the ten models are combined, which further improves the efficiency and accuracy of DNA-binding residue prediction.
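Combining the predictions of the trained models by majority can be sketched as follows. The function name `majority_vote`, the default threshold of 0.5, and the handling of ties (an even split is treated as non-binding) are assumptions; the text names a threshold but gives no value and does not address ties.

```python
from typing import List, Sequence


def majority_vote(
    probabilities: Sequence[Sequence[float]], threshold: float = 0.5
) -> List[int]:
    """probabilities[k][i] is the output of model k for residue i.

    Returns 1 (binding) or 0 (non-binding) for every residue.
    """
    n_models = len(probabilities)
    n_residues = len(probabilities[0])
    labels = []
    for i in range(n_residues):
        # A model votes "binding" when its output exceeds the threshold.
        votes = sum(1 for k in range(n_models) if probabilities[k][i] > threshold)
        # ASSUMPTION: a strict majority is required, so ties count as non-binding.
        labels.append(1 if 2 * votes > n_models else 0)
    return labels
```

With ten models, a residue is labelled as binding under this sketch only when at least six models agree.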
Drawings
FIG. 1 is a schematic diagram of a deep convolutional neural network-based DNA binding residue prediction method.
FIG. 2 shows the result of DNA binding residue prediction of protein sequence 1X3C using a deep convolutional neural network-based prediction method.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to FIG. 1 and FIG. 2, a DNA-binding residue prediction method based on a deep convolutional neural network comprises the following steps:
1) input a protein sequence S of L residues for which DNA-binding residues are to be predicted;
2) for the protein sequence S, use the psi-blast program (https://toolkit.tuebingen.mpg.de/tools/psiblst) to search the protein sequence database swissprot (https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/) and generate a position-specific scoring matrix of size L × 20, denoted PSSM;
3) for the protein sequence S, use the PSSpred program (https://zhanglab.ccmb.med.umich.edu/PSSpred) to search the protein sequence database nr (https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr) and generate a protein secondary structure matrix of size L × 3, denoted PSS;
4) combine the two-dimensional matrices obtained in steps 2) and 3) into an L × 23 feature matrix, denoted F;
5) add 8 rows of zeros before and after F; then, starting from the 9th row of F and ending at the (L-9)th row of F, take the residue corresponding to the middle row as the prediction target and the 8 adjacent rows before and after it as the feature matrix of that residue;
6) build a deep convolutional neural network to predict the DNA-binding residues of the protein sequence S; the network has eight layers, of which the first seven are convolutional layers and the last is a fully connected layer; each convolutional layer contains a two-dimensional convolution layer, a normalization layer and a pooling layer; the output of each layer serves as the input of the next; and the fully connected layer uses a sigmoid activation function so that the value passed on from the convolutional layers is mapped into the range (0, 1);
7) generate residue samples from a protein sequence with known binding residues by steps 2) to 5), and repeat this to build a training set; divide the training set into M training subsets (M = 10 in this embodiment), where the positive residue samples of each subset comprise all positive samples of the training set, and negative samples are added at random to each subset at a positive-to-negative ratio of 1:2;
8) train the deep convolutional neural network built in 6) on each of the M training subsets from 7), each training run using the two-class cross-entropy loss function to adjust the network parameters, which yields M deep convolutional neural network models in total; the two-class cross-entropy loss function is written as:
$$Y = -\left[\,u \log \hat{u} + (1 - u)\log(1 - \hat{u})\,\right]$$
where u is the true label of the residue under test in the protein sequence, $\hat{u}$ is the predicted output value of the network model, and Y characterizes the gap between the predicted output and the true label;
9) feed the residue samples generated from the protein sequence S into the M models obtained in 8); an output probability threshold, threshold, is set for each model, and the positions whose output value exceeds threshold are the binding residues predicted by that model; every residue sample in S is thus predicted by the M models, giving M prediction results, and the majority outcome among these M results is the final prediction.
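The two-class cross-entropy loss named in step 8) can be written out numerically as in the sketch below, which uses the standard form of the loss. Averaging over the samples and clipping predictions by a small `eps` to avoid log(0) are assumptions added for the illustration; the name `binary_cross_entropy` is hypothetical.

```python
import math
from typing import Sequence


def binary_cross_entropy(
    u: Sequence[float], u_hat: Sequence[float], eps: float = 1e-12
) -> float:
    """Mean of -(u*log(u_hat) + (1-u)*log(1-u_hat)) over all residue samples."""
    total = 0.0
    for label, pred in zip(u, u_hat):
        # ASSUMPTION: clip predictions into (eps, 1 - eps) so the logs stay finite.
        p = min(max(pred, eps), 1.0 - eps)
        total += -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))
    return total / len(u)
```

The loss shrinks as the sigmoid outputs move toward the true labels, which is the quantity the training runs in step 8) reduce when adjusting the network parameters.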
In this embodiment, the DNA binding residue prediction of the protein sequence 1X3C is taken as an example, and a DNA binding residue prediction method based on a deep convolutional neural network includes the following steps:
1) input the protein 1X3C, which has 73 residues and whose DNA-binding residues are to be predicted, and denote it S;
2) for the protein sequence S, use the psi-blast program (https://toolkit.tuebingen.mpg.de/tools/psiblst) to search the protein sequence database swissprot (https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/) and generate a position-specific scoring matrix of size 73 × 20, denoted PSSM;
3) for the protein sequence S, use the PSSpred program (https://zhanglab.ccmb.med.umich.edu/PSSpred) to search the protein sequence database nr (https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr) and generate a protein secondary structure matrix of size 73 × 3, denoted PSS;
4) combine the two-dimensional matrices obtained in steps 2) and 3) into a 73 × 23 feature matrix, denoted F;
5) add 8 rows of zeros before and after F; then, starting from the 9th row of F and ending at the 64th row of F, take the residue corresponding to the middle row as the prediction target and the 8 adjacent rows before and after it as the feature matrix of that residue;
6) build a deep convolutional neural network to predict the DNA-binding residues of the protein sequence S; the network has eight layers, of which the first seven are convolutional layers and the last is a fully connected layer; each convolutional layer contains a two-dimensional convolution layer, a normalization layer and a pooling layer; the output of each layer serves as the input of the next; and the fully connected layer uses a sigmoid activation function so that the value passed on from the convolutional layers is mapped into the range (0, 1);
7) generate residue samples from a protein sequence with known binding residues by steps 2) to 5), and repeat this to build a training set; divide the training set into ten training subsets, where the positive residue samples of each subset comprise all positive samples of the training set, and negative samples are added at random to each subset at a positive-to-negative ratio of 1:2;
8) train the deep convolutional neural network built in 6) on each of the ten training subsets from 7), each training run using the two-class cross-entropy loss function to adjust the network parameters, which yields ten deep convolutional neural network models in total; the two-class cross-entropy loss function is written as:
$$Y = -\left[\,u \log \hat{u} + (1 - u)\log(1 - \hat{u})\,\right]$$
where u is the true label of the residue under test in the protein sequence, $\hat{u}$ is the predicted output value of the network model, and Y characterizes the gap between the predicted output and the true label;
9) feed the residue samples generated from the protein sequence S into the ten models obtained in step 8); an output probability threshold, threshold, is set for each model, and the positions whose output value exceeds threshold are the binding residues predicted by that model; every residue sample in S is thus predicted by the ten models, giving ten prediction results, and the majority outcome among these ten results is the final prediction.
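As a quick check of the matrix sizes quoted for 1X3C, the arithmetic can be traced with a small helper. The name `feature_shapes` is hypothetical, and the 17-row window (centre row plus 8 rows on each side) is an assumption about how the window in step 5) is counted.

```python
from typing import Dict, Tuple


def feature_shapes(
    n_residues: int, half_window: int = 8, pssm_cols: int = 20, pss_cols: int = 3
) -> Dict[str, Tuple[int, int]]:
    """Return the (rows, columns) of each matrix produced in steps 2) to 5)."""
    cols = pssm_cols + pss_cols
    return {
        "PSSM": (n_residues, pssm_cols),
        "PSS": (n_residues, pss_cols),
        "F": (n_residues, cols),
        # 8 zero rows are added before and after F.
        "padded_F": (n_residues + 2 * half_window, cols),
        # ASSUMPTION: the window includes the centre row.
        "window": (2 * half_window + 1, cols),
    }


shapes_1x3c = feature_shapes(73)
```

For L = 73 this gives a 73 × 23 feature matrix, an 89 × 23 padded matrix, and 17 × 23 residue samples; the end row quoted in step 5) of the embodiment, 64, equals L - 9.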
The above description is the prediction result obtained by the present invention using the prediction of DNA binding residues of protein sequence 1X3C as an example, and is not intended to limit the scope of the present invention, and various modifications and improvements can be made without departing from the scope of the present invention.

Claims (1)

1. A DNA-binding residue prediction method based on a deep convolutional neural network, characterized in that the prediction method comprises the following steps:
1) input a protein sequence S of L residues for which DNA-binding residues are to be predicted;
2) for the protein sequence S, use the psi-blast program to search the protein sequence database swissprot and generate a position-specific scoring matrix of size L × 20, denoted PSSM;
3) for the protein sequence S, use the PSSpred program to search the protein sequence database nr and generate a protein secondary structure matrix of size L × 3, denoted PSS;
4) combine the two-dimensional matrices obtained in steps 2) and 3) into an L × 23 feature matrix, denoted F;
5) add 8 rows of zeros before and after F; then, starting from the 9th row of F and ending at the (L-9)th row of F, take the residue corresponding to the middle row as the prediction target and the 8 adjacent rows before and after it as the feature matrix of that residue;
6) build a deep convolutional neural network to predict the DNA-binding residues of the protein sequence S; the network has eight layers, of which the first seven are convolutional layers and the last is a fully connected layer; each convolutional layer contains a two-dimensional convolution layer, a normalization layer and a pooling layer; the output of each layer serves as the input of the next; and the fully connected layer uses a sigmoid activation function so that the value passed on from the convolutional layers is mapped into the range (0, 1);
7) generate residue samples from a protein sequence with known binding residues by steps 2) to 5), and repeat this to build a training set; divide the training set into M training subsets, where the positive residue samples of each subset comprise all positive samples of the training set, and negative samples are added at random to each subset at a positive-to-negative ratio of 1:2;
8) train the deep convolutional neural network built in 6) on each of the M training subsets from 7), each training run using the two-class cross-entropy loss function to adjust the network parameters, which yields M deep convolutional neural network models in total; the two-class cross-entropy loss function is written as:
$$Y = -\left[\,u \log \hat{u} + (1 - u)\log(1 - \hat{u})\,\right]$$
where u is the true label of the residue under test in the protein sequence, $\hat{u}$ is the predicted output value of the network model, and Y characterizes the gap between the predicted output and the true label;
9) feed the residue samples generated from the protein sequence S into the M models obtained in 8); an output probability threshold, threshold, is set for each model, and the positions whose output value exceeds threshold are the binding residues predicted by that model; every residue sample in S is thus predicted by the M models, giving M prediction results, and the majority outcome among these M results is the final prediction.
CN202010533489.4A 2020-06-12 2020-06-12 DNA binding residue prediction method based on deep convolutional neural network Active CN111785321B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010533489.4A CN111785321B (en) 2020-06-12 2020-06-12 DNA binding residue prediction method based on deep convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010533489.4A CN111785321B (en) 2020-06-12 2020-06-12 DNA binding residue prediction method based on deep convolutional neural network

Publications (2)

Publication Number Publication Date
CN111785321A CN111785321A (en) 2020-10-16
CN111785321B true CN111785321B (en) 2022-04-05

Family

ID=72756179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010533489.4A Active CN111785321B (en) 2020-06-12 2020-06-12 DNA binding residue prediction method based on deep convolutional neural network

Country Status (1)

Country Link
CN (1) CN111785321B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112466392B (en) * 2020-11-12 2024-03-22 浙江工业大学 ATP binding residue prediction method based on deep convolutional network
CN112365921B (en) * 2020-11-17 2022-07-15 浙江工业大学 A protein secondary structure prediction method based on long and short-term memory network
CN113257342B (en) * 2021-04-09 2024-05-07 浙江工业大学 Protein interaction site prediction method based on residue position characteristics
CN113096733B (en) * 2021-05-11 2022-09-30 同济大学 Motif mining method based on sequence and shape information deep fusion
CN113851192B (en) * 2021-09-15 2023-06-30 安庆师范大学 Training method and device for amino acid one-dimensional attribute prediction model and attribute prediction method
CN114512188B (en) * 2022-03-20 2024-04-05 湖南大学 DNA binding protein recognition method based on improved protein sequence position specificity matrix

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104992079A (en) * 2015-06-29 2015-10-21 南京理工大学 Sampling learning based protein-ligand binding site prediction method
WO2018175986A1 (en) * 2017-03-23 2018-09-27 Rutgers, The State University Of New Jersey Systems and methods for modeling a protein parameter for understanding protein interactions and generating an energy map
CN111063389A (en) * 2019-12-04 2020-04-24 浙江工业大学 A Ligand Binding Residue Prediction Method Based on Deep Convolutional Neural Networks
CN111081311A (en) * 2019-12-26 2020-04-28 青岛科技大学 Prediction of protein lysine malonylation sites based on deep learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2820048B1 (en) * 2012-02-29 2019-04-17 Gilead Biologics, Inc. Antibodies to matrix metalloproteinase 9
CN107478754A (en) * 2016-06-07 2017-12-15 复旦大学 A kind of pre-treating method for detecting Residues in Milk aminoglycoside antibiotics
CN110689920B (en) * 2019-09-18 2022-02-11 上海交通大学 A protein-ligand binding site prediction method based on deep learning

Also Published As

Publication number Publication date
CN111785321A (en) 2020-10-16

Similar Documents

Publication Publication Date Title
CN111785321B (en) DNA binding residue prediction method based on deep convolutional neural network
CN116417093A (en) A Drug-Target Interaction Prediction Method Combining Transformer and Graph Neural Network
Zheng et al. Emerging deep learning methods for single-cell RNA-seq data analysis
CN111063389A (en) A Ligand Binding Residue Prediction Method Based on Deep Convolutional Neural Networks
CN114783526B (en) Deep unsupervised single-cell clustering method based on Gaussian mixture graph variational autoencoder
CN116386729A (en) scRNA-seq data dimension reduction method based on graph neural network
CN114091603A (en) A spatial transcriptome cell clustering and analysis method
CN112951328B (en) miRNA-gene relationship prediction method and system based on deep learning heterogeneous information network
Zaki et al. Identifying protein complexes in protein-protein interaction data using graph convolutional network
Suo et al. Application of clustering analysis in brain gene data based on deep learning
Chen et al. Learning and interpreting the gene regulatory grammar in a deep learning framework
Li et al. Attention-based deep clustering method for scRNA-seq cell type identification
Jiang et al. Protein-protein interaction sites prediction using batch normalization based CNNs and oversampling method borderline-SMOTE
CN112149885B (en) Ligand binding residue prediction method based on sequence template
Termritthikun et al. Evolutionary neural architecture search based on efficient CNN models population for image classification
CN114512188B (en) DNA binding protein recognition method based on improved protein sequence position specificity matrix
Cai et al. Realize generative yet complete latent representation for incomplete multi-view learning
Ko et al. A deep learning adversarial autoencoder with dynamic batching displays high performance in denoising and ordering scRNA-seq data
CN112071362B (en) Method for detecting protein complex fusing global and local topological structures
CN117409968B (en) Hierarchical attention-based cancer dynamic survival analysis method and system
Bilen et al. A new hybrid and ensemble gene selection approach with an enhanced genetic algorithm for classification of microarray gene expression values on leukemia cancer
Yang et al. RNDEtree: regulatory network with differential equation based on flexible neural tree with novel criterion function
CN115862767A (en) anti-LRRK 2 small molecule drug prediction and screening method based on graph learning
CN116343908A (en) Method, medium and device for predicting protein coding region by fusing DNA shape characteristics
John et al. CNN-BLSTM based deep learning framework for eukaryotic kinome classification: An explainability based approach

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221109

Address after: D1101, Building 4, Software Industry Base, No. 19, 17, 18, Haitian 1st Road, Binhai Community, Yuehai Street, Nanshan District, Shenzhen, Guangdong, 518000

Patentee after: Shenzhen Xinrui Gene Technology Co.,Ltd.

Address before: N2248, Floor 3, Xingguang Yingjing, No. 117, Shuiyin Road, Yuexiu District, Guangzhou, Guangdong 510,000

Patentee before: GUANGZHOU ZHAOJI BIOTECHNOLOGY CO.,LTD.

Effective date of registration: 20221109

Address after: N2248, Floor 3, Xingguang Yingjing, No. 117, Shuiyin Road, Yuexiu District, Guangzhou, Guangdong 510,000

Patentee after: GUANGZHOU ZHAOJI BIOTECHNOLOGY CO.,LTD.

Address before: 310014 No. 18 Chao Wang Road, Xiacheng District, Zhejiang, Hangzhou

Patentee before: ZHEJIANG UNIVERSITY OF TECHNOLOGY
