CN115273968B - Quality evaluation method and device for protein prediction three-dimensional structure - Google Patents

Quality evaluation method and device for protein prediction three-dimensional structure

Info

Publication number
CN115273968B
CN115273968B CN202210754951.2A CN202210754951A CN115273968B CN 115273968 B CN115273968 B CN 115273968B CN 202210754951 A CN202210754951 A CN 202210754951A CN 115273968 B CN115273968 B CN 115273968B
Authority
CN
China
Prior art keywords
prediction
sequence
predicted
amino acid
protein
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210754951.2A
Other languages
Chinese (zh)
Other versions
CN115273968A (en)
Inventor
管佳威
张闻瀚
金慧玲
王浩博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Liwen Institute Biotechnology Co ltd
Original Assignee
Hangzhou Liwen Institute Biotechnology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Liwen Institute Biotechnology Co ltd
Priority to CN202210754951.2A
Publication of CN115273968A
Application granted
Publication of CN115273968B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/20 Protein or domain folding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00 ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • G16B5/20 Probabilistic models

Landscapes

  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • Molecular Biology (AREA)
  • Physiology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Chemical & Material Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a quality evaluation method and device for predicted three-dimensional protein structures. The key finding is that the esmif cross entropy loss, which describes the degree of sequence recovery, is linearly correlated with TMscore, a structure quality evaluation function that describes similarity to the real structure; the quality of a predicted structure is therefore judged by calculating the cross entropy between the probability distribution of the predicted sequence and the reference sequence. The method comprises the following steps: the reference sequence is input into various protein structure prediction models to obtain thousands or tens of thousands of predicted three-dimensional structures. A predicted three-dimensional structure can also be obtained by manually folding the amino acid chain, or by manually fine-tuning a structure output by a prediction model. Sequences are then recovered from the predicted three-dimensional structures by inverse folding, and the accuracy of each predicted structure is judged from the difference between its recovered sequence and the reference sequence, so as to obtain the three-dimensional structure closest to the real protein.

Description

Quality evaluation method and device for protein prediction three-dimensional structure
Technical Field
The invention relates to the field of protein three-dimensional structure prediction, in particular to a quality evaluation method and device for protein three-dimensional structure prediction.
Background
Proteins are very important biomolecules in nature. Predicting the three-dimensional structure of a protein directly from its amino acid sequence is a challenging problem with significant impact on modern biology and medicine. Accurate prediction of three-dimensional protein structure plays a key role in understanding protein function, designing proteins with new biological functions, and developing new medicines. With the completion of the Human Genome Project, a large number of protein amino acid sequences have become known through genome sequencing, and the number of new amino acid sequences obtained by sequencing is still growing explosively, while the number of experimentally determined three-dimensional structures grows far more slowly. The main experimental methods at present are X-ray crystallography, nuclear magnetic resonance (NMR) and cryo-electron microscopy (Cryo-EM). These methods often require a significant amount of time and expensive resources.
One major challenge in structure prediction is selecting the best three-dimensional structure from a generated pool of candidate structures. Protein structure prediction models such as Rosetta, RoseTTAFold and AlphaFold2 can predict a large number of three-dimensional protein structures from one amino acid sequence, but it is difficult to tell which structure is closest to the native structure. It is therefore desirable to find a method that obtains a highly accurate predicted three-dimensional protein structure from the amino acid sequence alone.
Disclosure of Invention
In view of the shortcomings of the prior art, one object of the invention is to provide a quality evaluation method for predicted three-dimensional protein structures that can obtain a highly accurate three-dimensional protein structure without multiple sequence alignment (MSA) data.
In order to achieve the above object, the invention provides the following technical solution:
A quality evaluation method for predicted three-dimensional protein structures comprises the following steps:
S1, predicting a plurality of predicted structures according to a reference sequence, wherein the reference sequence reflects the real distribution of the known protein amino acid sequence, and the predicted structures reflect the three-dimensional structure of the predicted protein. The predicted structures may include three-dimensional structures that differ substantially from the real structure of the protein, so the quality requirement on the initially input predicted structures is low;
S2, sequentially inputting the plurality of predicted structures into the ESM-IF1 model to obtain predicted sequences in one-to-one correspondence with the predicted structures, wherein each predicted sequence reflects the probability distribution of amino acids at each site in the predicted protein amino acid sequence;
S3, sequentially calculating the multi-class cross entropy (CCE) between each predicted sequence and the reference sequence to obtain the esmif cross entropy loss, and selecting the predicted structure corresponding to the minimum esmif cross entropy loss as the optimal three-dimensional structure.
Preferably, the reference sequence and the predicted sequence are each presented in a matrix, a first dimension of the matrix representing sequence position information, a second dimension of the matrix representing amino acid identity information,
the multi-classification cross entropy calculation method of the prediction sequence and the reference sequence comprises the following steps:
$$\mathrm{CCE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j} p_{ij}\,\log q_{ij}$$
wherein CCE is the multi-class cross entropy, N is the length of the protein amino acid sequence, p is the probability distribution of each amino acid in the reference sequence expressed with a one-hot encoding, q is the probability distribution of the amino acid at each site in the predicted sequence, i is the first-dimension index (sequence position), and j is the second-dimension index (amino acid identity). One-hot encoding is a binary encoding in which, of the bits used to encode a value, exactly one bit is 1 and all the others are 0.
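A minimal sketch of this calculation, assuming the cross entropy is averaged over the N sequence positions and that p and q are stored as arrays of shape (N, number of amino acid types). The function name `multiclass_cross_entropy` and the clipping constant `eps` are illustrative choices, not part of the disclosure.

```python
import numpy as np

def multiclass_cross_entropy(p, q, eps=1e-12):
    """Mean over positions i of -sum_j p[i, j] * log(q[i, j]).

    p: (N, K) one-hot reference matrix.
    q: (N, K) predicted probability matrix.
    Averaging over N and clipping q at eps are assumptions of this sketch.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("p and q must have the same shape")
    log_q = np.log(np.clip(q, eps, 1.0))  # avoid log(0)
    return float(-(p * log_q).sum(axis=1).mean())

# Toy example: 2 positions, 3 amino acid types.
p = np.array([[1, 0, 0], [0, 1, 0]])
q = np.array([[0.8, 0.1, 0.1], [0.25, 0.5, 0.25]])
cce = multiclass_cross_entropy(p, q)  # (-log 0.8 - log 0.5) / 2
```

Because p is one-hot, only the probability that q assigns to the reference amino acid at each position contributes to the sum.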
Preferably, the predicted structure is obtained by one of the following: inputting the reference sequence into a protein structure prediction model, manually folding the amino acid chain, or manually adjusting a predicted structure output by a protein structure prediction model.
In view of the shortcomings of the prior art, a second object of the invention is to provide a quality evaluation device for predicted three-dimensional protein structures, comprising:
a predicted structure acquisition module, used for outputting a plurality of predicted structures according to a reference sequence, wherein the reference sequence reflects the real distribution of the known protein amino acid sequence, and the predicted structures reflect the three-dimensional structure of the predicted protein;
a predicted sequence acquisition module, used for sequentially inputting the plurality of predicted structures into the ESM-IF1 model to obtain predicted sequences in one-to-one correspondence with the predicted structures, wherein each predicted sequence reflects the probability distribution of amino acids at each site in the predicted protein amino acid sequence;
and a structure screening module, used for sequentially calculating the multi-class cross entropy between each predicted sequence and the reference sequence to obtain the esmif cross entropy loss, and selecting the predicted structure corresponding to the minimum esmif cross entropy loss as the optimal three-dimensional structure.
In view of the shortcomings of the prior art, a third object of the invention is to provide an electronic device, comprising:
a processor; and
a memory storing executable code that, when executed by the processor, causes the processor to perform the quality evaluation method for predicted three-dimensional protein structures described above.
Compared with the prior art, the invention has the following advantages: the esmif cross entropy loss, which describes the degree of sequence recovery, is found to be linearly related to TMscore, the structure quality evaluation function that describes similarity to the real structure, and the quality of a predicted structure is judged by calculating the cross entropy between the probability distribution of the predicted sequence and the reference sequence. The method can obtain a highly accurate three-dimensional protein structure without homologous multiple sequence alignment (MSA) data.
Drawings
FIG. 1 is a matrix diagram of a reference sequence;
FIG. 2 is a matrix diagram of a predicted sequence;
FIG. 3 is a diagram of a predicted structure;
FIG. 4 shows scatter plots of the structure quality evaluation function TMscore against the esmif cross entropy loss for 9 proteins.
Detailed Description
The invention will now be described in further detail with reference to the drawings and examples.
Example 1
AlphaFold has made remarkable progress in protein structure prediction by using deep learning techniques and co-evolution information from multiple sequence alignments (MSA) of related protein sequences. Given a single input amino acid sequence, it searches for hundreds of homologous sequences to form an MSA, learns reliable co-evolution information from the MSA in accordance with biological and physical laws, and finally folds the amino acid chain into a low potential energy state to obtain the predicted three-dimensional protein structure. However, MSAs are difficult to obtain, and a method that depends too heavily on MSA rather than on the physical properties of protein folding may not accurately predict the effect of new mutations on protein structure and stability. From Anfinsen's work, it is known that a protein folds so as to minimize its potential energy. Thus, if the potential energy function can be modeled with high accuracy, the protein structure can be predicted by optimizing that function. The difficulty with this approach is how to construct the potential energy function accurately. We call this the scoring problem, also known as structure quality assessment (Quality Assessment, QA).
In addition to Google's AlphaFold, Meta (United States) has opened up a new approach with big-data-driven pre-trained models, attempting to solve protein structure prediction and design problems from another perspective. Among these, ESM-IF1 is a large-scale pre-trained model from Meta [Hsu C, Verkuil R, Liu J, et al. Learning inverse folding from millions of predicted structures[J]. bioRxiv, 2022.]. It predicts the amino acid sequence of a protein from the protein's backbone structure. It uses the structures of 12 million protein sequences predicted by AlphaFold2 as a training set and recovers sequences with a GVP model that processes its input in a geometrically invariant way. The method recovers 51% of the original sequence overall and 72% of buried residues, exceeding the best existing algorithm by about 10%. It was developed for protein design (the inverse folding problem) and is a relatively advanced pre-trained protein model.
In the invention, a reference sequence (the original sequence, which can be a wild-type protein sequence) is input into various protein structure prediction models to obtain thousands or tens of thousands of predicted three-dimensional structures. A predicted three-dimensional structure can also be obtained by manually folding the amino acid chain, or by manually fine-tuning a structure output by a prediction model. Sequences are then recovered from the predicted three-dimensional structures by inverse folding, and the accuracy of each predicted structure is judged from the difference between its recovered sequence and the reference sequence, so as to obtain the three-dimensional structure closest to the real protein. The specific steps are as follows:
S1, predicting a plurality of predicted structures according to a reference sequence, wherein the reference sequence reflects the real distribution of the known protein amino acid sequence. As shown in FIG. 1, the first dimension of the matrix represents the sequence site and the second dimension represents the amino acid type. The probability distribution of the amino acid sequence in FIG. 1 is represented with a one-hot encoding (every value is either 0 or 1); the brightness of each small square in the figure corresponds to the probability value (the brighter the square, the larger the value), and the brightness of the squares in FIG. 1 represents a probability value of 1. Specifically, the square in the fifth column in FIG. 1 indicates that the probability of serine (S) at the 5th site of the protein is 1 (since the sequence is known, the amino acid type at each site is fully determined), namely p_ij = p_17 = 1, where i represents the position of the corresponding amino acid in the sequence and j represents the amino acid type. In addition to the 20 standard amino acids, the alphabet includes the five additional codes X, B, U, Z and O (rare or ambiguous amino acids), with "." representing a period and "-" representing a deletion, so j takes 27 different values (e.g. 3 to 29), each representing a different amino acid type. Specifically, the values 3 to 29 represent 'L', 'A', 'G', 'V', 'S', 'E', 'R', 'T', 'I', 'D', 'P', 'K', 'Q', 'N', 'F', 'Y', 'M', 'H', 'W', 'C', 'X', 'B', 'U', 'Z', 'O', '.' and '-', in that order. As shown in FIG. 3, the predicted structure reflects the three-dimensional structure of the predicted protein. The predicted structures may include three-dimensional structures that differ substantially from the real structure of the protein, so the quality requirement on the initially input predicted structures is low;
S2, sequentially inputting the plurality of predicted structures into the ESM-IF1 model to obtain predicted sequences in one-to-one correspondence with the predicted structures, wherein each predicted sequence reflects the probability distribution of amino acids at each site in the predicted protein amino acid sequence. As shown in FIG. 2, the first dimension of the matrix represents the sequence site position and the second dimension represents the amino acid identity. The brightness of each small square in FIG. 2 corresponds to a probability value: the brighter the square, the larger the value. If a column contains several probability values (that is, several small squares), the amino acid at the position with the largest value is selected as the prediction result. Specifically, the square in the fifth column in FIG. 2 indicates that the probability of serine (S) at the 5th site of the protein is 0.8, namely q_ij = q_17 = 0.8, and the amino acid at site 5 is predicted to be serine. Other representations of the probability values, such as three-dimensional columns of different heights, may also be used, without limitation;
S3, sequentially calculating the multi-class cross entropy (CCE) between each predicted sequence and the reference sequence to obtain the esmif cross entropy loss, and selecting the predicted structure corresponding to the minimum esmif cross entropy loss as the optimal three-dimensional structure;
the multi-class cross entropy is calculated as follows:
$$\mathrm{CCE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j} p_{ij}\,\log q_{ij}$$
wherein CCE is the multi-class cross entropy, N is the length of the protein amino acid sequence, p is the probability distribution of each amino acid in the reference sequence expressed with a one-hot encoding, q is the probability distribution of the amino acid at each site in the predicted sequence, i is the first-dimension index (sequence position), and j is the second-dimension index (amino acid identity). One-hot encoding is a binary encoding in which, of the bits used to encode a value, exactly one bit is 1 and all the others are 0.
The multi-class cross entropy comes from Shannon's information theory and is used here mainly to measure the discrepancy between the predicted result and the true sequence: the lower the CCE, the higher the degree of sequence recovery, which also indicates a higher quality of the corresponding input structure.
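Steps S1 to S3 can be sketched as follows. Because the reference matrix p is one-hot, the double sum reduces to the negative log probability that the predicted distribution assigns to the reference residue at each site, averaged over sites (the averaging is an assumption of this sketch). The token order follows the list in step S1, with '.' and '-' assumed as the last two entries. The function names and the `toy_distribution` stand-in for model outputs are hypothetical; no real ESM-IF1 call is made here.

```python
import numpy as np

# Token order as listed in step S1, mapped to indices 3..29. Treating the
# last two entries as '.' and '-' is an assumption of this sketch.
TOKENS = list("LAGVSERTIDPKQNFYMHWCXBUZO") + [".", "-"]
OFFSET = 3
TOKEN_TO_INDEX = {tok: i + OFFSET for i, tok in enumerate(TOKENS)}

def one_hot(sequence):
    """Reference matrix p: one row per site, one column per token."""
    p = np.zeros((len(sequence), len(TOKENS)))
    for i, aa in enumerate(sequence):
        p[i, TOKEN_TO_INDEX[aa] - OFFSET] = 1.0
    return p

def esmif_style_loss(sequence, q, eps=1e-12):
    """Mean negative log-probability of the reference residue at each site."""
    cols = [TOKEN_TO_INDEX[aa] - OFFSET for aa in sequence]
    picked = q[np.arange(len(sequence)), cols]
    return float(-np.log(np.clip(picked, eps, 1.0)).mean())

def select_best(sequence, predicted):
    """Return (name, loss) of the candidate structure with the smallest loss."""
    losses = {name: esmif_style_loss(sequence, q) for name, q in predicted.items()}
    best = min(losses, key=losses.get)
    return best, losses[best]

def toy_distribution(sequence, prob_correct):
    """Hypothetical stand-in for a model output: prob_correct on the
    reference residue, the remainder spread uniformly over other tokens."""
    k = len(TOKENS)
    q = np.full((len(sequence), k), (1.0 - prob_correct) / (k - 1))
    q[one_hot(sequence) == 1.0] = prob_correct
    return q

reference = "MKTSL"  # made-up reference sequence
predicted = {
    "candidate_a": toy_distribution(reference, 0.8),
    "candidate_b": toy_distribution(reference, 0.3),
}
best_name, best_loss = select_best(reference, predicted)
```

In this toy setup the candidate whose distribution puts 0.8 on every reference residue is selected, and its loss equals -log 0.8, about 0.223, which is also the per-site contribution of the serine example with q = 0.8 in step S2.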
The quality assessment method for predicted three-dimensional protein structures rests on the following procedure:
S1.1, obtaining a plurality of reference sequences and the experimentally determined reference structures corresponding to them, wherein the reference structures reflect the real three-dimensional structures of the proteins;
S1.2, obtaining a plurality of predicted structures according to a reference sequence;
S1.3, sequentially calculating the difference between each predicted structure and the reference structure to obtain the corresponding TMscore, an index describing structure quality;
S1.4, sequentially inputting the plurality of predicted structures into ESM-IF1 to obtain the probability distributions of the predicted sequences in one-to-one correspondence with the predicted structures;
S1.5, sequentially calculating the multi-class cross entropy between the probability distribution of each predicted sequence and the reference sequence to obtain the esmif cross entropy loss;
S1.6, plotting a scatter diagram of the esmif cross entropy loss against TMscore. As shown in FIG. 4, each scatter plot of TMscore and esmif cross entropy loss in FIG. 4 shows the data for a single protein, where each dot represents one decoy. The abscissa is TMscore, which measures closeness to the real structure: the closer to 1 (right), the closer to the real structure. The ordinate is the esmif cross entropy loss of the recovered sequence: the lower the value, the better the sequence recovery. Specifically, the 9 proteins in FIG. 4 are 1fzy, 1l3k, 1opd, 1t3y, 1z2u, 1zma, 2cxd, 2dfb and 2z0t, respectively, and the data come from the Rosetta decoy set. We found that the esmif cross entropy loss and TMscore are linearly related, so that an indication of TMscore, the index describing structure quality, can be obtained merely by calculating the multi-class cross entropy (CCE) between the predicted sequence and the reference sequence. Experiments show that when the low-quality structure (decoy) data set generated by Rosetta is fed into the model constructed by this method, TMscore, which measures the similarity between a decoy and the real structure, shows a strong negative correlation with the multi-class cross entropy. That is, when a low-quality structure is input, it is difficult for the model to produce a probability distribution close to the original sequence. Thus, the model also enables high-quality structure quality assessment.
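The relationship described in S1.6 could be checked for one protein along the following lines, using a Pearson correlation coefficient and a least-squares line fit. The numbers below are made-up illustrative values, not data from the Rosetta decoy set or from FIG. 4, and the helper name `pearson_r` is hypothetical.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

# Made-up illustrative values for six decoys of a single protein
# (NOT taken from the patent): TMscore and the matching cross entropy loss.
tm_scores = [0.35, 0.48, 0.60, 0.72, 0.85, 0.93]
losses = [2.60, 2.35, 2.10, 1.90, 1.62, 1.45]

r = pearson_r(tm_scores, losses)

# Least-squares line: loss ~ slope * TMscore + intercept.
slope, intercept = np.polyfit(tm_scores, losses, deg=1)
```

A value of `r` close to -1 together with a negative `slope` would correspond to the strong negative linear correlation reported in the text; repeating the calculation per protein mirrors the one-panel-per-protein layout of FIG. 4.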
Example 2
A quality assessment device for predicted three-dimensional protein structures, comprising:
a predicted structure acquisition module, used for outputting a plurality of predicted structures according to a reference sequence, wherein the reference sequence reflects the real distribution of the known protein amino acid sequence, and the predicted structures reflect the three-dimensional structure of the predicted protein;
a predicted sequence acquisition module, used for sequentially inputting the plurality of predicted structures into the ESM-IF1 model to obtain predicted sequences in one-to-one correspondence with the predicted structures, wherein each predicted sequence reflects the probability distribution of amino acids at each site in the predicted protein amino acid sequence;
and a structure screening module, used for sequentially calculating the multi-class cross entropy between each predicted sequence and the reference sequence to obtain the esmif cross entropy loss, and selecting the predicted structure corresponding to the minimum esmif cross entropy loss as the optimal three-dimensional structure.
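One way to picture how the three modules fit together is as two pluggable callables plus a screening method. This is only a structural sketch: the class name `QualityAssessmentDevice`, the type aliases, and the two stub functions are hypothetical, and the stubs return made-up values in place of a real structure predictor and a real ESM-IF1 model.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Structure = str                       # placeholder handle, e.g. a file path
Distribution = List[Dict[str, float]]  # per site: amino acid -> probability

@dataclass
class QualityAssessmentDevice:
    # Predicted structure acquisition module: reference sequence -> structures.
    acquire_structures: Callable[[str], Dict[str, Structure]]
    # Predicted sequence acquisition module: structure -> per-site distribution.
    predict_sequence: Callable[[Structure], Distribution]

    def screen(self, reference: str) -> Tuple[str, float]:
        """Structure screening module: return the lowest-loss structure."""
        losses = {}
        for name, structure in self.acquire_structures(reference).items():
            dist = self.predict_sequence(structure)
            # Mean negative log-probability of the reference residue per site.
            total = sum(
                -math.log(max(site.get(aa, 0.0), 1e-12))
                for site, aa in zip(dist, reference)
            )
            losses[name] = total / len(reference)
        best = min(losses, key=losses.get)
        return best, losses[best]

def fake_structures(reference: str) -> Dict[str, Structure]:
    """Stub: pretends two candidate structures were predicted."""
    return {"model_1": "model_1.pdb", "model_2": "model_2.pdb"}

def fake_inverse_folding(structure: Structure) -> Distribution:
    """Stub: made-up distributions for the two-residue sequence 'SA'."""
    p = 0.9 if structure == "model_1.pdb" else 0.4
    return [{"S": p, "A": 1 - p}, {"A": p, "S": 1 - p}]

device = QualityAssessmentDevice(fake_structures, fake_inverse_folding)
best, loss = device.screen("SA")
```

Keeping the two acquisition modules as injected callables means a different structure source (a prediction model, manual folding, or manual adjustment) can be swapped in without touching the screening logic.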
Example 3
An electronic device, comprising:
a processor; and
a memory storing executable code that, when executed by the processor, causes the processor to perform the quality assessment method for predicted three-dimensional protein structures described in Example 1.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the present invention can be made by one of ordinary skill in the art without departing from the principles of the present invention and are intended to be within the scope of the present invention.

Claims (4)

1. A quality evaluation method for predicted three-dimensional protein structures, characterized by comprising the following steps:
S1, predicting a plurality of predicted structures according to a reference sequence, wherein the reference sequence reflects the real distribution of the known protein amino acid sequence, and the predicted structures reflect the three-dimensional structure of the predicted protein;
S2, sequentially inputting the plurality of predicted structures into an ESM-IF1 model to obtain predicted sequences in one-to-one correspondence with the predicted structures, wherein each predicted sequence reflects the probability distribution of amino acids at each site in the predicted protein amino acid sequence;
S3, sequentially calculating the multi-class cross entropy between each predicted sequence and the reference sequence to obtain an esmif cross entropy loss, and selecting the predicted structure corresponding to the minimum esmif cross entropy loss as the optimal three-dimensional structure;
wherein the reference sequence and the predicted sequence are each presented as a matrix, a first dimension of the matrix representing the sequence position and a second dimension of the matrix representing the amino acid type,
and the multi-class cross entropy between the predicted sequence and the reference sequence is calculated as follows:
$$\mathrm{CCE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j} p_{ij}\,\log q_{ij}$$
wherein CCE is the multi-class cross entropy, N is the length of the protein amino acid sequence, p is the probability distribution of each amino acid in the reference sequence expressed with a one-hot encoding, q is the probability distribution of the amino acid at each site in the predicted sequence, i is the first-dimension index (sequence position), and j is the second-dimension index (amino acid identity).
2. The quality evaluation method for predicted three-dimensional protein structures according to claim 1, wherein the predicted structure is obtained by one of the following: inputting the reference sequence into a protein structure prediction model, manually folding the amino acid chain, or manually adjusting a predicted structure output by a protein structure prediction model.
3. A quality assessment device for predicted three-dimensional protein structures, comprising:
a predicted structure acquisition module, used for outputting a plurality of predicted structures according to a reference sequence, wherein the reference sequence reflects the real distribution of the known protein amino acid sequence, and the predicted structures reflect the three-dimensional structure of the predicted protein;
a predicted sequence acquisition module, used for sequentially inputting the plurality of predicted structures into an ESM-IF1 model to obtain predicted sequences in one-to-one correspondence with the predicted structures, wherein each predicted sequence reflects the probability distribution of amino acids at each site in the predicted protein amino acid sequence;
a structure screening module, used for sequentially calculating the multi-class cross entropy between each predicted sequence and the reference sequence to obtain an esmif cross entropy loss, and selecting the predicted structure corresponding to the minimum esmif cross entropy loss as the optimal three-dimensional structure;
wherein the reference sequence and the predicted sequence are each presented as a matrix, a first dimension of the matrix representing the sequence position and a second dimension of the matrix representing the amino acid type,
and the multi-class cross entropy between the predicted sequence and the reference sequence is calculated as follows:
$$\mathrm{CCE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j} p_{ij}\,\log q_{ij}$$
wherein CCE is the multi-class cross entropy, N is the length of the protein amino acid sequence, p is the probability distribution of each amino acid in the reference sequence expressed with a one-hot encoding, q is the probability distribution of the amino acid at each site in the predicted sequence, i is the first-dimension index (sequence position), and j is the second-dimension index (amino acid identity).
4. An electronic device, comprising:
a processor; and
a memory storing executable code that, when executed by the processor, causes the processor to perform the quality evaluation method for predicted three-dimensional protein structures according to any one of claims 1-2.
CN202210754951.2A 2022-06-30 2022-06-30 Quality evaluation method and device for protein prediction three-dimensional structure Active CN115273968B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210754951.2A CN115273968B (en) 2022-06-30 2022-06-30 Quality evaluation method and device for protein prediction three-dimensional structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210754951.2A CN115273968B (en) 2022-06-30 2022-06-30 Quality evaluation method and device for protein prediction three-dimensional structure

Publications (2)

Publication Number | Publication Date
CN115273968A (en) | 2022-11-01
CN115273968B (en) | 2023-05-12

Family

ID=83763425

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210754951.2A Active CN115273968B (en) 2022-06-30 2022-06-30 Quality evaluation method and device for protein prediction three-dimensional structure

Country Status (1)

Country Link
CN (1) CN115273968B (en)

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US7231301B2 (en) * | 1997-01-31 | 2007-06-12 | Japan Science And Technology Corporation | Method and a system for predicting protein functional site, a method for improving protein function, and a function-modified protein
JP4876253B2 (en) * | 2006-10-27 | 2012-02-15 | 国立大学法人 東京大学 | Protein relative quantification method, program and system
CN109817275B (en) * | 2018-12-26 | 2020-12-01 | 东软集团股份有限公司 | Protein function prediction model generation method, protein function prediction device, and computer readable medium
CN111063389B (en) * | 2019-12-04 | 2021-10-29 | 浙江工业大学 | Ligand binding residue prediction method based on deep convolutional neural network
CN112420127B (en) * | 2020-10-26 | 2024-08-06 | 大连民族大学 | Non-coding RNA and protein interaction prediction method based on secondary structure and multimode fusion
CN112837740B (en) * | 2021-01-21 | 2024-03-26 | 浙江工业大学 | DNA binding residue prediction method based on structural characteristics
CN114333984A (en) * | 2022-01-10 | 2022-04-12 | 青岛理工大学 | Intelligent prediction method for small molecule-protein binding affinity
CN114613427B (en) * | 2022-03-15 | 2023-01-31 | 水木未来(北京)科技有限公司 | Protein three-dimensional structure prediction method and device, electronic device and storage medium

Also Published As

Publication number Publication date
CN115273968A (en) 2022-11-01

Similar Documents

Publication Publication Date Title
CN113593631B (en) Method and system for predicting protein-polypeptide binding site
CN112233723B (en) Protein structure prediction method and system based on deep learning
CN111127493A (en) Remote sensing image semantic segmentation method based on attention multi-scale feature fusion
CN112862774B (en) Accurate segmentation method for remote sensing image building
Li et al. Automatic bridge crack identification from concrete surface using ResNeXt with postprocessing
CN112990196B (en) Scene text recognition method and system based on super-parameter search and two-stage training
CN113257357B (en) Protein residue contact map prediction method
CN109508746A (en) Pulsar candidate's body recognition methods based on convolutional neural networks
CN113435461B (en) Point cloud local feature extraction method, device, equipment and storage medium
CN114511710A (en) Image target detection method based on convolutional neural network
CN113420619A (en) Remote sensing image building extraction method
CN117173568A (en) Target detection model training method and target detection method
CN115908909A (en) Evolutionary neural architecture searching method and system based on Bayes convolutional neural network
CN114093415B (en) Peptide fragment detectability prediction method and system
CN115273968B (en) Quality evaluation method and device for protein prediction three-dimensional structure
CN112908421A (en) Tumor neogenesis antigen prediction method, device, equipment and medium
CN112529057A (en) Graph similarity calculation method and device based on graph convolution network
CN114758721B (en) Deep learning-based transcription factor binding site positioning method
CN114036326B (en) Image retrieval and classification method, system, terminal and storage medium
Wan et al. RSSM-Net: Remote sensing image scene classification based on multi-objective neural architecture search
CN114220013A (en) Camouflaged object detection method based on boundary alternating guidance
CN113077009A (en) Tunnel surrounding rock lithology identification method based on migration learning model
CN113851192B (en) Training method and device for amino acid one-dimensional attribute prediction model and attribute prediction method
CN112098291A (en) Electric imaging porosity spectrum-based permeability calculation method and system
CN117292747B (en) Space transcriptome spot gene expression prediction method based on HSIC-bottleneck

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant