CN116504302A - Novel hepatitis B virus capsid assembly regulator de novo design and virtual screening method based on generation model and computational chemistry - Google Patents
- Publication number
- CN116504302A (application number CN202310736846.0)
- Authority
- CN
- China
- Prior art keywords
- hbv
- capsid protein
- capsid
- hepatitis
- screening
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/20—Protein or domain folding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C10/00—Computational theoretical chemistry, i.e. ICT specially adapted for theoretical aspects of quantum chemistry, molecular mechanics, molecular dynamics or the like
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses a method for the de novo design and virtual screening of novel hepatitis B virus (HBV) capsid assembly modulators based on a generative model and computational chemistry, comprising the following steps: predicting the structure of the full-length wild-type HBV core capsid protein dimer; generating a candidate small-molecule database with a pre-trained GENTRL generative model; performing scaffold-hopping screening using Lipinski's rule of five combined with structural similarity; screening for small molecules with favorable binding modes; and calculating the binding free energy between each small molecule and the HBV capsid protein to select capsid assembly modulators with anti-HBV activity. Through scaffold hopping, molecular docking, and capsid protein stability analysis based on molecular dynamics simulation, the invention screens for effective HBV capsid assembly modulators. It generates new molecules efficiently, broadens the chemical-space distribution of the candidate molecules, captures potential capsid assembly modulators more accurately through CTD stability analysis, and markedly accelerates the discovery of lead compounds.
Description
Technical Field
The invention relates to the field of virtual screening of lead compounds, and in particular to a method for the de novo design and virtual screening of novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry.
Background
Hepatitis B virus (HBV) is an infectious virus that causes chronic hepatitis B. There are about 300 million HBV carriers worldwide, and about 1 million people die each year from cirrhosis, hepatocellular carcinoma, and related complications caused by HBV infection. The anti-HBV drugs currently on the market are mainly interferon and nucleoside analogues: interferon acts on the transcription process, and nucleoside analogues act on the reverse transcription process. Neither class can eradicate intracellular cccDNA, so long-term administration is required. Interferon is expensive and has side effects, nucleoside analogues readily induce drug resistance, and the prognosis of HBV infection remains unsatisfactory, so novel anti-HBV drugs are urgently needed.
The HBV core (capsid) protein monomer consists of 183 amino acids and comprises two distinct domains: residues 1-149 form the N-terminal domain (NTD), and residues 150-183 form the C-terminal domain (CTD). The CTD is highly flexible and plays several roles in the HBV life cycle. It contains multiple arginine-rich regions that interact with RNA to initiate capsid assembly. The existing capsid protein structures lack part of the CTD, and the mechanism of action between the full-length protein and capsid assembly modulators has not been explored.
Capsid assembly modulators (CAMs) have been shown to act on the capsid protein as novel anti-HBV agents, either accelerating capsid assembly or inducing aberrant capsid assembly. By accelerating assembly when the virus enters the cell or while capsids are being assembled, CAMs expose the viral DNA to the cytosol, where it is enzymatically degraded. This offers an opportunity to eradicate intracellular cccDNA and thereby cure hepatitis B.
Virtual screening combines structural biology and computational chemistry to accelerate the discovery of lead compounds. It follows the principles of ligand-based and receptor-based drug design and speeds up the screening and discovery of molecules active against a target by drawing on existing small-molecule data and protein structures. In the prior art, CAMs have been discovered mainly by derivatizing known scaffolds, which is slow. The binding mode between small molecules and the capsid protein has mainly been predicted by docking small molecules to capsid protein structures that lack the CTD. Although the binding site is well defined, the mechanism of action between the small molecules and the capsid protein remains unclear, and the docking score shows no correlation with small-molecule activity. The prior art therefore has limited capacity to screen for CAMs and lacks unified, well-defined screening criteria, so constructing a new virtual screening method for potential anti-HBV drugs is of great significance.
Disclosure of Invention
To address the problems in the related art, the invention provides a method for the de novo design and virtual screening of novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry. Building on a new mechanism of action between CAMs and the HBV capsid protein, the method uses the GENTRL molecular generative model to learn the features of existing molecules active against the target and to generate a candidate small-molecule database with the target properties. De novo design and virtual screening of novel CAMs are then achieved through scaffold hopping, molecular docking, molecular dynamics simulation, and free energy calculation, which effectively overcomes the technical problems of the prior art.
For this purpose, the invention adopts the following specific technical scheme:
A method for the de novo design and virtual screening of novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry, the method comprising the following steps:
S1, constructing the full-length hepatitis B capsid protein structure: obtaining the amino acid sequence of the full-length wild-type core capsid protein of hepatitis B virus, and predicting the dimer structure of the full-length wild-type HBV core capsid protein;
S2, generating and constructing a candidate small-molecule database: training a GENTRL generative model on the obtained training compound set, and generating the candidate small-molecule database with the pre-trained GENTRL generative model;
S3, constructing the scaffold-hopping model and screening: pre-screening the candidate small-molecule database using Lipinski's rule of five for drug-likeness, calculating the Euclidean distance between each database molecule and the target molecule based on WHALES descriptors, and performing scaffold-hopping screening according to structural similarity;
S4, activity screening and binding mode prediction based on molecular docking: docking the small molecules to the HBV capsid protein with molecular docking software, predicting the binding mode between each small molecule and the HBV capsid protein, and selecting the small molecules with favorable binding modes;
S5, structure-activity relationship prediction and screening based on molecular dynamics simulation: using molecular dynamics simulation software combined with a trajectory analysis package to analyze the effect of the small molecules on the stability of the HBV capsid protein C-terminal domain, constructing a structure-activity relationship model, predicting the EC50 of the small molecules, calculating the binding free energy between each small molecule and the HBV capsid protein, and selecting the capsid assembly modulators with anti-HBV activity.
Preferably, obtaining the amino acid sequence of the full-length wild-type core capsid protein of hepatitis B virus and predicting the dimer structure of the full-length wild-type HBV core capsid protein comprises the following steps:
S11, obtaining the amino acid sequence of the full-length wild-type core capsid protein of hepatitis B virus from the NCBI bioinformatics database;
S12, predicting the dimer structure of the full-length wild-type HBV core capsid protein with the homomultimer prediction model of AlphaFold2, and performing energy optimization.
Preferably, the training compound set is obtained from the ChEMBL and ZINC databases;
the training data in the training compound set comprise the extended-connectivity fingerprints of HBV capsid assembly modulators, common capsid modulators, and random ZINC molecules;
the GENTRL generative model comprises a variational autoencoder, a latent-space probability distribution, a generator, and a reward function based on an SVM classification algorithm.
The molecular activity threshold against HBV for the capsid assembly modulators and the common capsid modulators is set to 10000 nM. The molecular weight and lipid-water partition coefficient distributions of the random ZINC molecules are matched to those of the capsid assembly modulators. The extended-connectivity fingerprint is a Morgan fingerprint with a radius of 2 and 2048 bits.
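The activity threshold above can be illustrated with a short sketch. It assumes one plausible reading of the text, namely that a molecule counts as active when its measured activity is at or below 10000 nM and that molecules without a measurement (such as random ZINC molecules) count as inactive. The function names and the record layout are hypothetical and are not taken from the patent or from GENTRL.

```python
# Threshold stated in the text, in nM.
ACTIVITY_THRESHOLD_NM = 10000.0


def label_active(activity_nm, threshold_nm=ACTIVITY_THRESHOLD_NM):
    """Return 1 if a measured activity is at or below the threshold, else 0.

    Assumption: lower values mean higher potency, and a missing measurement
    (None) is treated as inactive.
    """
    return int(activity_nm is not None and activity_nm <= threshold_nm)


def build_labels(records):
    """Map each molecule id to a 0/1 activity label.

    `records` is a hypothetical dict of molecule id -> activity in nM (or None).
    """
    return {mol_id: label_active(activity) for mol_id, activity in records.items()}


if __name__ == "__main__":
    example = {"cam_1": 120.0, "cam_2": 25000.0, "zinc_1": None}
    print(build_labels(example))
```

Labels of this kind could serve as targets for a classifier-based reward such as the SVM mentioned above; how the patent actually wires the threshold into the reward is not specified in the text.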
Preferably, pre-screening the candidate small-molecule database using Lipinski's rule of five, calculating the Euclidean distance between each database molecule and the target molecule based on WHALES descriptors, and performing scaffold-hopping screening according to structural similarity comprises the following steps:
S31, pre-screening the candidate small-molecule database using Lipinski's rule of five for drug-likeness, and generating 3D structures for the small molecules with RDKit and Open Babel;
S32, calculating the 3D descriptors of the small molecules with WHALES, obtaining the Euclidean distance between each database molecule and the target compound, and performing scaffold hopping by ranking the molecules by Euclidean distance.
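Steps S31 and S32 can be sketched in a few lines of plain Python. This is a minimal illustration, not the patent's implementation: it assumes the molecular properties and descriptor vectors have already been computed elsewhere (for example with RDKit and a WHALES implementation), applies the strict form of Lipinski's rule with no violations allowed, and uses hypothetical key names and descriptor values.

```python
import math


def passes_rule_of_five(props):
    """Strict Lipinski check on a dict of precomputed properties.

    Expected (hypothetical) keys: mol_weight, logp, h_donors, h_acceptors.
    """
    return (
        props["mol_weight"] <= 500
        and props["logp"] <= 5
        and props["h_donors"] <= 5
        and props["h_acceptors"] <= 10
    )


def rank_by_distance(descriptors, target):
    """Return molecule ids sorted by Euclidean distance to the target vector.

    `descriptors` maps molecule id -> descriptor vector (same length as target).
    """
    return sorted(descriptors, key=lambda mol_id: math.dist(descriptors[mol_id], target))


if __name__ == "__main__":
    library = {"a": [0.0, 3.0], "b": [1.0, 0.0], "c": [5.0, 5.0]}
    print(rank_by_distance(library, [0.0, 0.0]))
```

Molecules at the top of the ranking are the closest to the target compound in descriptor space and are the natural candidates for scaffold hopping.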
Preferably, docking the small molecules to the HBV capsid protein with molecular docking software comprises:
using the full-length wild-type HBV core capsid protein dimer predicted by AlphaFold2 as the receptor structure, and pre-processing the receptor with Chimera and Maestro, including three-dimensional structure optimization, addition of hydrogens, and calculation of atomic charges;
pre-processing the small-molecule ligands with RDKit and Open Babel;
using SMINA as the docking software, scoring each small molecule by affinity, generating 9 different docking poses per molecule, taking the score of the top-ranked pose as the docking score of the molecule, and selecting the top 10 compounds for subsequent screening by molecular dynamics simulation.
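The pose and compound selection described above can be sketched as follows. The sketch assumes that the affinities have already been parsed from the docking output into a dict, and that a lower (more negative) affinity in kcal/mol indicates a better pose, as in Vina-family scoring functions. The function names and the example values are hypothetical.

```python
def docking_score(pose_affinities):
    """Return the affinity of the best pose (most negative value)."""
    return min(pose_affinities)


def select_top_compounds(poses_by_molecule, n_top=10):
    """Rank molecules by their best-pose affinity and keep the first n_top.

    `poses_by_molecule` maps molecule id -> list of pose affinities (kcal/mol).
    """
    scores = {mol_id: docking_score(poses) for mol_id, poses in poses_by_molecule.items()}
    return sorted(scores, key=scores.get)[:n_top]


if __name__ == "__main__":
    poses = {"m1": [-7.1, -6.5], "m2": [-9.3, -8.0], "m3": [-8.2]}
    print(select_top_compounds(poses, n_top=2))
```

With 9 poses per molecule and `n_top=10`, this reproduces the selection rule stated in the text: one docking score per molecule, then the ten best-scoring compounds.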
The invention also relates to a novel mechanism of interaction between CAMs and the HBV capsid protein. Specifically, the stability of the CTD is key to the assembly rate of the HBV capsid protein: CAMs bind to the active site of the HBV capsid protein and accelerate its assembly by stabilizing the CTD, so that empty HBV capsids are formed and HBV replication is inhibited.
Preferably, the mechanism is validated with five known CAMs, namely AT-130, GLP-26, NVR-3-778, BAY-41-4109, and SPA, using 30 ns molecular dynamics simulations and trajectory analysis to calculate CTD stability.
Preferably, analyzing the effect of the small molecules on the stability of the HBV capsid protein C-terminal domain using molecular dynamics simulation software combined with a trajectory analysis package comprises the following steps:
preparing the simulation input files with CHARMM-GUI, and simulating for 30 ns with the CHARMM36 force field and OpenMM to generate a 300-frame trajectory file;
saving the interaction between the HBV capsid protein and the small molecule as a dcd file containing the 3D trajectory, wherein the dcd file records the position of every atom of the HBV capsid protein and the ligand in each of the 300 frames of the simulation;
reading the dcd file with MDTraj and calculating the stability indices of the HBV capsid protein C-terminal domain, namely the RMSD and RMSF of residues 150-183:
$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\delta_i^{2}}$$
$$\mathrm{RMSF}_i = \sqrt{\frac{1}{T}\sum_{t_j=1}^{T}\left(x_i(t_j)-\tilde{x}_i\right)^{2}}$$
where $N$ is the total number of atoms; $\delta_i^{2}$ is the squared positional offset between the $i$-th atom of the current frame and the $i$-th atom of the target frame, i.e. the sum of the squared offsets along the X, Y, and Z axes; $T$ is the total simulation duration; $x_i(t_j)$ is the Cartesian coordinates of atom $i$ at time $t_j$; and $\tilde{x}_i$ is the Cartesian coordinates of atom $i$ at the initial moment.
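The two stability indices can be illustrated with a small pure-Python sketch. It assumes the standard definitions of RMSD (root-mean-square offset over the atoms of one frame relative to a target frame) and RMSF (root-mean-square offset of one atom over all frames relative to a reference position), and it takes coordinates as plain lists of (x, y, z) tuples. In practice the coordinates would come from the dcd trajectory, restricted to residues 150-183; structural superposition before the calculation is omitted here, and all names are hypothetical.

```python
import math


def rmsd(frame, target):
    """RMSD between two frames, each a list of (x, y, z) tuples, one per atom."""
    squared = sum(
        (a - b) ** 2
        for atom, ref in zip(frame, target)
        for a, b in zip(atom, ref)
    )
    return math.sqrt(squared / len(frame))


def rmsf(trajectory, reference):
    """Per-atom RMSF over all frames, relative to the reference positions."""
    n_frames = len(trajectory)
    values = []
    for i, ref in enumerate(reference):
        squared = sum(
            (a - b) ** 2
            for frame in trajectory
            for a, b in zip(frame[i], ref)
        )
        values.append(math.sqrt(squared / n_frames))
    return values


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    moved = [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]
    print(rmsd(moved, reference))
    print(rmsf([reference, moved], reference))
```

A lower RMSD or RMSF for the C-terminal residues in the ligand-bound simulation than in the ligand-free one would indicate that the ligand stabilizes the domain.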
Preferably, the stability calculation builds on a large number of preliminary molecular dynamics simulations of the HBV capsid protein bound to known CAMs, which revealed a new mechanism of action between CAMs and the HBV capsid protein: CAMs accelerate capsid assembly by stabilizing the C-terminal domain of the HBV capsid protein.
Preferably, the structure-activity relationship model performs a t-test between the C-terminal domain RMSD of the ligand-free protein simulation system and the C-terminal domain RMSD of the protein simulation system with the small-molecule ligand bound, calculates the p-value, and predicts the EC50 of the small molecule from the p-value.
The t statistic is calculated from the means of the two RMSD samples, the sizes m and n of the two data sets, and the unbiased estimates of the variances of the two data sets. The p-value is then obtained from a t-distribution table, and molecules with a t-test p-value below 0.05 proceed to the subsequent free energy calculation step.
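The t statistic can be computed with the standard library alone. The sketch below assumes the unequal-variance (Welch) form of the two-sample t-test, which uses exactly the quantities listed above: the two sample means, the sample sizes m and n, and the unbiased variance estimates. The text does not say whether the patent uses this form or the pooled-variance form, so this choice and the function name are assumptions.

```python
from statistics import mean, variance


def welch_t(sample_x, sample_y):
    """Return the Welch t statistic and its approximate degrees of freedom.

    `variance` from the statistics module is the unbiased (n - 1) estimator.
    """
    m, n = len(sample_x), len(sample_y)
    var_x, var_y = variance(sample_x), variance(sample_y)
    se_squared = var_x / m + var_y / n
    t = (mean(sample_x) - mean(sample_y)) / se_squared ** 0.5
    df = se_squared ** 2 / (
        (var_x / m) ** 2 / (m - 1) + (var_y / n) ** 2 / (n - 1)
    )
    return t, df


if __name__ == "__main__":
    # Hypothetical per-frame CTD RMSD values for the two simulation systems.
    apo_rmsd = [1.0, 2.0, 3.0]
    bound_rmsd = [2.0, 4.0, 6.0]
    print(welch_t(apo_rmsd, bound_rmsd))
```

The resulting t value and degrees of freedom can then be looked up in a t-distribution table, as the text describes, to obtain the p-value that is compared with 0.05.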
Preferably, the binding free energy is calculated from the dcd file produced by the simulation and the simulation input files: the binding free energy between the small-molecule ligand and the HBV capsid protein is calculated with ParmEd and AMBER and compared with that of known capsid assembly modulators, and the final lead compounds are selected for biological activity verification.
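The binding free energy is calculated as:
$$\Delta G_{\mathrm{bind,solv}} = \Delta G_{\mathrm{bind,vacuum}} + \Delta G_{\mathrm{solv,complex}} - \left(\Delta G_{\mathrm{solv,ligand}} + \Delta G_{\mathrm{solv,receptor}}\right)$$
where
$\Delta G_{\mathrm{bind,solv}}$: protein receptor-ligand binding free energy in the solvent system;
$\Delta G_{\mathrm{bind,vacuum}}$: protein receptor-ligand binding free energy in the vacuum system;
$\Delta G_{\mathrm{solv,complex}}$: solvation free energy of the protein-ligand complex in the solvent system;
$\Delta G_{\mathrm{solv,ligand}}$: solvation free energy of the ligand in the solvent system;
$\Delta G_{\mathrm{solv,receptor}}$: solvation free energy of the protein receptor in the solvent system.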
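The five quantities defined above fit the usual thermodynamic cycle of end-point free energy methods, in which the binding free energy in solvent equals the binding free energy in vacuum plus the solvation free energy of the complex minus the solvation free energies of the ligand and the receptor. The sketch below only performs this arithmetic. It assumes that cycle, and the function name, argument names, and example values are hypothetical; the individual terms would come from the ParmEd and AMBER calculation.

```python
def binding_free_energy_in_solvent(dg_bind_vacuum, dg_solv_complex,
                                   dg_solv_ligand, dg_solv_receptor):
    """Combine the cycle terms into the solvated binding free energy.

    All arguments and the result share one energy unit (for example kcal/mol).
    """
    return dg_bind_vacuum + dg_solv_complex - (dg_solv_ligand + dg_solv_receptor)


if __name__ == "__main__":
    # Illustrative numbers only, not results from the patent.
    print(binding_free_energy_in_solvent(-50.0, -120.0, -20.0, -110.0))
```

A more negative result indicates stronger predicted binding, which is the basis for comparing a candidate with a known modulator.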
Preferably, the binding free energy is compared with the binding free energy of the known capsid assembly modifier GLP-26 and the final lead compound is screened for biological activity verification.
The beneficial effects of the invention are as follows. Starting from existing small-molecule data, the de novo design and screening method uses the GENTRL model to generate a candidate data set of new small molecules that have not appeared before, and screens for effective HBV capsid assembly modulators through scaffold hopping, molecular docking, and capsid protein stability analysis based on molecular dynamics simulation. The method generates new molecules efficiently, broadens the chemical-space distribution of the candidate molecules, captures potential capsid assembly modulators more accurately through CTD stability analysis, and markedly accelerates the discovery of lead compounds.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the method for the de novo design and virtual screening of novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the structure of the full-length wild-type HBV capsid protein in the method according to an embodiment of the present invention;
FIG. 4 is a diagram of the architecture of the GENTRL generative model in the method according to an embodiment of the present invention;
FIG. 5 is a flow chart of scaffold hopping in the method according to an embodiment of the present invention;
FIG. 6 is a visualization of the novel mechanism of action between CAMs and the HBV capsid protein in the method according to an embodiment of the present invention;
FIG. 7 is a flow chart of the molecular dynamics simulation in the method according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of the CTD stability calculation in the method according to an embodiment of the present invention;
FIG. 9 is a flow chart illustrating the method for the de novo design and virtual screening of novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry according to the present invention.
Detailed Description
To further illustrate the embodiments, the present invention provides the accompanying drawings, which form part of the disclosure. The drawings mainly illustrate the embodiments and, together with the description, explain their principles. With reference to them, a person skilled in the art will recognize other possible implementations and the advantages of the present invention. Elements in the drawings are not drawn to scale, and like reference numerals generally designate like elements.
According to embodiments of the present invention, a de novo design and virtual screening method for novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry is provided.
The present invention is further described below with reference to the accompanying drawings and the detailed description. As shown in FIGS. 1 to 9, the de novo design and virtual screening method for novel hepatitis B virus (HBV) capsid assembly modulators based on a generative model and computational chemistry according to an embodiment of the present invention comprises the following steps:
S1, constructing a full-length hepatitis B capsid protein structure: obtaining the amino acid sequence of the full-length wild-type core capsid protein of the hepatitis B virus and predicting the dimer structure of the full-length wild-type HBV core capsid protein, which comprises the following steps:
S11, obtaining the amino acid sequence of the full-length wild-type core capsid protein of the hepatitis B virus from the NCBI bioinformatics database;
S12, predicting the dimer structure of the full-length wild-type HBV core capsid protein using the homomultimer prediction model of AlphaFold2, and performing energy minimization;
The HBV capsid protein is a key component of the HBV structure. The HBV capsid is an icosahedron about 22 nm in size. Each core protein monomer consists of 183 amino acid residues and is divided into two domains: the N-terminal domain (NTD), residues 1-149, and the C-terminal domain (CTD), residues 150-183. The HBV capsid protein structures resolved to date contain only residues 1-155.
Using the AlphaFold2 multimer model, two copies of the monomer amino acid sequence were input, the multimer_model_max_num_recycles parameter of the model was set to 3, and the model was run. The top-ranked model in the output was selected; the resulting full-length HBV capsid protein structure is shown in FIG. 3.
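The two-chain input described above can be sketched as a multi-record FASTA file, which is the usual way a homodimer is passed to a multimer predictor. This is a minimal illustration only: the helper name `homodimer_fasta`, the record names and the placeholder sequence are assumptions, and the real 183-residue sequence would come from the NCBI record of step S11.

```python
def homodimer_fasta(name, sequence, copies=2, width=60):
    """Build a multi-record FASTA string holding `copies` identical chains."""
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Wrap the sequence at a fixed column width, as is conventional for FASTA.
    wrapped = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
    records = []
    for index in range(copies):
        chain = chr(ord("A") + index)  # chain labels A, B, ...
        records.append(f">{name}_chain_{chain}\n{wrapped}")
    return "\n".join(records) + "\n"


# Placeholder only: this is NOT the HBV core protein sequence.
PLACEHOLDER_SEQUENCE = "ACDEFGHIKLMNPQRSTVWY" * 4

fasta_text = homodimer_fasta("HBc_placeholder", PLACEHOLDER_SEQUENCE)
```

Writing `fasta_text` to a file gives one record per chain, so the predictor treats the input as a dimer rather than a monomer.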
S2, generating and constructing a candidate small molecule database: training a GENTRL generative model on the obtained training compound set, and generating a candidate small molecule database with the trained GENTRL generative model;
The training data in the training compound set comprise the extended-connectivity fingerprints of HBV capsid assembly modulators, common capsid modulators and random ZINC molecules. The GENTRL generative model comprises a variational autoencoder, a latent-space probability distribution, a generator and a reward function based on an SVM classification algorithm. The activity threshold for the HBV capsid assembly modulators and the common capsid modulators is set to 10000 nM; the molecular weight and lipid-water partition coefficient (logP) distributions of the random ZINC molecules are consistent with those of the capsid assembly modulators; and the extended-connectivity fingerprint is a Morgan fingerprint with a radius of 2 and 2048 bits.
To generate novel small-molecule structures that do not yet exist while ensuring that they have properties similar to those of the target molecules, the invention designs a molecular generation model whose structure is shown in FIG. 4.
The training data are obtained as follows. Known HBV capsid assembly modulators are collected from the ChEMBL database as training data 1, and common capsid assembly modulators as training data 2. The mean molecular weight and mean lipid-water partition coefficient of data set 1 are then calculated with RDKit, and 100,000 random molecules matching this distribution are selected from the ZINC data set as training data 3. Structures containing atoms other than carbon, nitrogen, oxygen, sulfur, fluorine, chlorine, bromine and hydrogen are removed from all three data sets, and the conventional medicinal chemistry filters MCF and PAINS are used to exclude compounds with potentially toxic or reactive groups. The molecules in all data sets are then converted to a uniform, normalized SMILES representation so that all molecules are written in the same SMILES encoding direction.
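The element filter in this cleaning step can be sketched in plain Python. The sketch assumes each molecule is represented by a molecular formula string (which could be produced upstream, for example with RDKit); the function names are illustrative and not part of the described method.

```python
import re

# Elements allowed in the training and candidate sets, as listed in the text.
ALLOWED_ELEMENTS = {"C", "N", "O", "S", "F", "Cl", "Br", "H"}

# One element symbol (capital letter, optional lowercase letter) plus a count.
_ELEMENT_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def elements_in_formula(formula):
    """Return the set of element symbols that appear in a molecular formula."""
    return {symbol for symbol, _count in _ELEMENT_TOKEN.findall(formula)}


def has_only_allowed_elements(formula):
    """True if the formula is non-empty and uses only the allowed elements."""
    elements = elements_in_formula(formula)
    return bool(elements) and elements <= ALLOWED_ELEMENTS


def filter_by_elements(formulas):
    """Keep the formulas that pass the element filter, preserving order."""
    return [f for f in formulas if has_only_allowed_elements(f)]
```

For example, `filter_by_elements(["C6H6", "C3H7NaO", "C2H5Br"])` keeps the first and last entries and drops the sodium-containing one.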
Next, the variational autoencoder and the prior distribution are trained on the three data sets. The lipid-water partition coefficient (logP) and the synthetic accessibility score (SAscore) are important molecular properties for judging whether a molecule is drug-like, and are important in drug discovery, agrochemical discovery and related fields. The molecular property used in the model is therefore the lipid-water partition coefficient with penalty terms (penalized logP), calculated as follows:
$$\text{penalized logP} = \log P - \text{SAscore} - \text{rings6}$$
where rings6 is a penalty applied to molecules whose carbocycles contain more than 6 atoms, which avoids the indiscriminate generation of unrealistic macrocycles. Training proceeds in two stages: first, the model is trained on the ZINC molecular data set so that it learns the features of conventional molecules; second, it is trained on data set 1 and data set 2 together so that it learns the target-specific features. Training yields a mapping from chemical space to the latent space, and this mapping also captures the relationship between molecules and their properties.
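A minimal sketch of a penalized logP score, assuming the commonly used form (logP minus the synthetic accessibility score minus a penalty for rings larger than six atoms). The logP value, the SA score and the ring sizes are assumed to be computed upstream; the function names and the exact form of the ring penalty are assumptions, not taken from the text.

```python
def ring_penalty(ring_sizes, max_ring_size=6):
    """Number of atoms by which the largest ring exceeds `max_ring_size`."""
    largest = max(ring_sizes, default=0)  # 0 for acyclic molecules
    return max(0, largest - max_ring_size)


def penalized_logp(logp, sa_score, ring_sizes):
    """Assumed form: logP - SA score - penalty for rings above six atoms."""
    return logp - sa_score - ring_penalty(ring_sizes)
```

With this form, a molecule with only five- and six-membered rings is not penalized, while an eight-membered ring lowers the score by 2.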
In this example, the models before and after optimization were randomly sampled 50000 times from the latent space obtained by training. Structures containing atoms other than carbon, nitrogen, oxygen, sulfur, fluorine, chlorine, bromine and hydrogen were removed, and the conventional medicinal chemistry filters MCF and PAINS were used to exclude compounds with potentially toxic or reactive groups, yielding the candidate small molecule data set.
S3, constructing the scaffold hopping model and screening: preliminarily screening the candidate small molecule database using the drug-likeness rule of five, calculating the Euclidean distance between each database molecule (i.e., a molecule in the candidate small molecule database) and the target molecule (i.e., the known capsid assembly modulator AT-130) based on WHALES descriptors, and performing scaffold hopping screening according to structural similarity;
The preliminary screening of the candidate small molecule database using the drug-likeness rule of five, the calculation of the Euclidean distance between database molecules and the target molecule using WHALES descriptors, and the scaffold hopping screening according to structural similarity comprise the following steps:
S31, preliminarily screening the candidate small molecule database using Lipinski's drug-likeness rule of five, and generating 3D structures for the small molecules using RDKit and Open Babel;
S32, calculating 3D descriptors of the small molecules using WHALES, obtaining the Euclidean distance between each database molecule and the target compound, and performing scaffold hopping according to the Euclidean distance ranking.
The flow of the scaffold hopping model is shown in FIG. 5. Before scaffold hopping, the present invention uses RDKit to calculate the number of hydrogen bond donors, the number of hydrogen bond acceptors, the molecular weight, the number of rotatable bonds and the lipid-water partition coefficient of each candidate small molecule.
Preferably, the screening thresholds of the drug-likeness rule of five are: hydrogen bond donors ≤ 5, hydrogen bond acceptors ≤ 10, molecular weight ≤ 500 Da, rotatable bonds ≤ 10 and lipid-water partition coefficient ≤ 5.
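The thresholds above translate directly into a filter. The sketch below assumes the five descriptors have already been computed for each molecule (for example with RDKit); the `Descriptors` container and the function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed drug-likeness descriptors of one candidate molecule."""
    h_donors: int
    h_acceptors: int
    mol_weight: float  # in Da
    rotatable_bonds: int
    logp: float


def passes_rule_of_five(d):
    """Apply the five thresholds stated in the text."""
    return (
        d.h_donors <= 5
        and d.h_acceptors <= 10
        and d.mol_weight <= 500.0
        and d.rotatable_bonds <= 10
        and d.logp <= 5.0
    )


def filter_candidates(candidates):
    """Return the names of the candidates that pass, in input order."""
    return [name for name, d in candidates.items() if passes_rule_of_five(d)]
```

Only molecules that satisfy all five conditions proceed to 3D structure generation and WHALES descriptor calculation.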
Optionally, a 3D structure must be built for each molecule when constructing the scaffold hopping model. The invention uses the RDKit package and Open Babel to generate three-dimensional structures for the molecules that pass the drug-likeness screening, optimizes them with RDKit's molecule embedding function and the MMFF94 molecular force field, and then calculates the Gasteiger charge of each atom in the molecule. With AT-130 as the target molecule, the charges and three-dimensional structure of AT-130 are prepared in the same manner. The do_whales module is then used to calculate the Mahalanobis distances and the WHALES descriptors of the template target molecule and the candidate database molecules.
Based on the WHALES descriptors of the target molecule and the candidate database molecules, the Euclidean distance between the target molecule and each candidate database molecule is calculated with the Euclidean distance module Euclidean_distances. The molecules are sorted by distance in ascending order, and the top 20% of compounds are selected for molecular docking screening.
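The ranking step can be sketched without any external library once the descriptor vectors exist. The sketch assumes each molecule is already represented by a fixed-length numeric descriptor vector; the function names and the rounding-up rule for the 20% cut-off are assumptions.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(target, candidates):
    """Sort (name, distance) pairs by ascending distance to the target."""
    scored = [(name, euclidean_distance(target, vec))
              for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])


def select_top_percent(ranked, percent=20):
    """Keep the closest `percent` of the ranked list (rounded up, at least 1)."""
    keep = max(1, (len(ranked) * percent + 99) // 100)
    return ranked[:keep]
```

A smaller distance means the candidate is closer to the target molecule in descriptor space, so the head of the ranked list is what moves on to docking.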
S4, activity screening and binding mode prediction based on molecular docking: docking the small molecules to the HBV capsid protein with molecular docking software, predicting the binding mode of each small molecule with the HBV capsid protein, and selecting small molecules with a favorable binding mode;
The docking of the small molecules to the HBV capsid protein with molecular docking software comprises:
using the full-length wild-type HBV core capsid protein dimer predicted by AlphaFold2 as the receptor structure, and using Chimera and Maestro to pretreat the receptor, including three-dimensional structure optimization, hydrogen addition and atomic charge calculation; pretreating the small molecule ligands using RDKit and Open Babel;
the docking software is SMINA; each small molecule is scored according to affinity, 9 different docking poses are generated for each molecule, the score of the top-ranked pose is used as the docking score of the molecule, and the 10 top-ranked compounds are selected for subsequent molecular dynamics simulation screening.
Preferably, the protein receptor of the present invention is the full-length wild-type HBV capsid protein structure model predicted by AlphaFold2, a dimer in which each monomer is 183 amino acid residues long. The docking binding pocket is defined by the position at which the small molecule ligand binds in the structure with PDB ID 5T2P.
For pretreatment of the protein, hydrogens and Gasteiger charges are added to the protein receptor with the Dock Prep module of the Chimera software, and the protein energy is then minimized with the Minimization module. For pretreatment of the small molecule ligands, a 2D structure of each molecule is generated from its SMILES with RDKit and hydrogens are added; the embedding function is run with a maxAttempts parameter of 100 and a random seed of 0xf00D, and the structure is then optimized into a 3D structure with the UFF force field, with a maximum of 1000 iterations. Charges are added to the small molecule ligands and their formats are converted with Open Babel. In the molecular docking step, docking is performed with Vina and its extension SMINA, with a random seed of 0, the binding site of the original small molecule ligand as the docking site and an exhaustiveness parameter of 24. Each small molecule is scored according to affinity, and the 10 top-ranked compounds are taken forward to molecular dynamics simulation screening.
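The selection rule at the end of the docking step (best pose per molecule, then the 10 best molecules) can be sketched as follows. The sketch assumes the pose affinities have already been parsed from the docking output into a dictionary, and relies on the Vina/SMINA convention that a more negative affinity (kcal/mol) is more favorable; the function names are illustrative.

```python
def best_affinity(pose_scores):
    """Most favorable (lowest) affinity among the poses of one ligand."""
    return min(pose_scores)


def select_top_ligands(docking_results, n_top=10):
    """Rank ligands by their best pose and keep the `n_top` best.

    `docking_results` maps a ligand name to the affinities of its poses;
    ligands without any pose are skipped.
    """
    best = {name: best_affinity(scores)
            for name, scores in docking_results.items() if scores}
    return sorted(best.items(), key=lambda item: item[1])[:n_top]
```

Each ligand contributes a single number to the ranking, so generating up to 9 poses per ligand only affects which pose supplies that number.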
S5, prediction and screening based on a structure-activity relationship from molecular dynamics simulation: analyzing the stabilizing effect of each small molecule on the C-terminal domain of the HBV capsid protein using molecular dynamics simulation software combined with trajectory analysis packages, constructing a structure-activity relationship model, predicting the EC50 of the small molecule, calculating the binding free energy between the small molecule and the HBV capsid protein, and screening for capsid assembly modulators with anti-HBV activity.
Preferably, the input structure for the molecular dynamics simulation of the present invention is the protein-ligand complex predicted in S4, and the molecular dynamics simulation steps are shown in FIG. 7. The simulation input files are prepared with the Solution Builder of the CHARMM-GUI online server. Suitable periodic boundaries and a water box are created for each biological system; the water box boundary is set at least 10 Å from the protein boundary, and the box is filled with the TIP3P water model. K+ and Cl- ions are added to neutralize the excess charge in the system, with the final KCl concentration kept at 0.15 M.
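As a rough worked example of the solvation settings above, the number of ions needed for 0.15 M KCl can be estimated from the box volume. This is a back-of-the-envelope sketch, not the procedure CHARMM-GUI actually uses: it assumes a rectangular box, uses the whole box volume rather than the water volume, and the function names and example dimensions are hypothetical.

```python
AVOGADRO = 6.02214076e23      # particles per mole
LITRES_PER_CUBIC_ANGSTROM = 1e-27


def padded_box(protein_extents, padding=10.0):
    """Box edge lengths (in angstrom) with `padding` added on every side."""
    return tuple(extent + 2.0 * padding for extent in protein_extents)


def ion_counts(box_edges, solute_charge, conc_molar=0.15):
    """Estimate (n_K, n_Cl) for a neutral system at the given salt concentration."""
    x, y, z = box_edges
    volume_litres = x * y * z * LITRES_PER_CUBIC_ANGSTROM
    pairs = round(conc_molar * volume_litres * AVOGADRO)
    # Extra counter-ions cancel the net charge of the solute.
    n_k = pairs + max(0, -solute_charge)
    n_cl = pairs + max(0, solute_charge)
    return n_k, n_cl
```

For a hypothetical solute spanning 80 Å in each direction, `padded_box` gives a 100 Å cube, which corresponds to roughly 90 K+/Cl- pairs before neutralizing counter-ions are added.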
The molecular simulation is run with the OpenMM software. The nonbonded method is Particle-Mesh Ewald (PME), bonds involving hydrogen are constrained, the simulation temperature is 303.15 K and the simulation pressure is standard atmospheric pressure. The simulation code is written entirely in Python using packages such as OpenMM, pandas and NumPy. The energy of the system is first minimized with the minimization function, and an unrestrained 30 ns simulation of 15000000 steps is then run with OpenMM. A trajectory frame is saved every 50000 steps, giving a 300-frame trajectory file. The protein-small molecule interaction is thus recorded as a dcd file containing the 3D trajectory, which holds the position of every atom of the HBV capsid protein and the ligand in each of the 300 frames. The energy of the system is recorded every 1000 steps.
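The simulation settings above imply an integration time step that the text does not state explicitly. The short calculation below derives it, together with the frame count and frame spacing, from the stated numbers; the function name and dictionary keys are illustrative.

```python
def simulation_schedule(total_steps, total_time_ns, frame_interval, energy_interval):
    """Derive the time step and output counts from the run settings."""
    return {
        # 1 ns = 1,000,000 fs
        "timestep_fs": total_time_ns * 1_000_000 / total_steps,
        "n_frames": total_steps // frame_interval,
        "time_per_frame_ns": total_time_ns * frame_interval / total_steps,
        "n_energy_records": total_steps // energy_interval,
    }


# Values taken from the description: 30 ns, 15,000,000 steps,
# one frame every 50,000 steps, one energy record every 1,000 steps.
schedule = simulation_schedule(15_000_000, 30, 50_000, 1_000)
```

The result is a 2 fs time step, 300 frames spaced 0.1 ns apart and 15000 energy records, which is consistent with the 300-frame dcd file described above.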
Preferably, all trajectory files generated by the OpenMM simulations are analyzed in Python with packages such as MDTraj, MDAnalysis, pandas, NumPy and Matplotlib. Because of the periodic boundary conditions, the generated trajectories must first be re-centered with MDTraj. MDTraj is then used to read the dcd file and calculate the stability indices of the C-terminal domain of the capsid protein. The stability indices are the RMSF and RMSD of residues 150-183:
$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\delta_i^{2}}, \qquad \mathrm{RMSF} = \sqrt{\frac{1}{T}\sum_{t_j=1}^{T}\left(x_i(t_j)-\tilde{x}_i\right)^{2}}$$
where N is the total number of atoms; $\delta_i^{2}$ is the squared positional offset between the i-th atom of the current frame and the i-th atom of the target frame, i.e., the sum of the squared offsets along the X, Y and Z axes; T is the total simulation duration; $x_i(t_j)$ is the Cartesian coordinates of atom i at time $t_j$; and $\tilde{x}_i$ is the Cartesian coordinates of atom i at the initial moment.
All trajectory frames are superimposed on the original structure and the RMSD over all atoms is calculated; the CA atom of each amino acid is selected to calculate the per-residue RMSF. The RMSD of the CTD is obtained by calculating the RMSD of the CTD between adjacent frame structures over successive time periods; the calculation process is shown in FIG. 8.
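The two stability indices can be sketched in plain Python for a trajectory stored as nested lists. The sketch assumes the frames have already been superimposed on the reference structure and restricted to the atoms of interest (for example the CTD); it uses the first frame as the RMSF reference, following the "initial moment" wording of the text, and the function names are illustrative. In practice a trajectory library such as MDTraj would do this on arrays.

```python
import math


def rmsd(frame_a, frame_b):
    """Root-mean-square deviation between two frames of (x, y, z) tuples."""
    if len(frame_a) != len(frame_b) or not frame_a:
        raise ValueError("frames must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(frame_a, frame_b)
    )
    return math.sqrt(total / len(frame_a))


def adjacent_frame_rmsd(trajectory):
    """RMSD between each pair of consecutive frames."""
    return [rmsd(trajectory[k], trajectory[k + 1])
            for k in range(len(trajectory) - 1)]


def rmsf(trajectory, reference=None):
    """Per-atom fluctuation around a reference frame (default: first frame)."""
    ref = trajectory[0] if reference is None else reference
    n_frames = len(trajectory)
    values = []
    for i, (rx, ry, rz) in enumerate(ref):
        squared = sum(
            (f[i][0] - rx) ** 2 + (f[i][1] - ry) ** 2 + (f[i][2] - rz) ** 2
            for f in trajectory
        )
        values.append(math.sqrt(squared / n_frames))
    return values
```

A 300-frame trajectory yields 299 adjacent-frame RMSD values per system, which is the kind of sample a later statistical comparison between the ligand-free and ligand-bound systems would use.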
Preferably, the structure-activity relationship model performs a t-test between the C-terminal domain RMSD of the ligand-free protein simulation system and the C-terminal domain RMSD of the protein simulation system with a bound small molecule ligand, calculates the p-value, and predicts the EC50 of the small molecule from the p-value:
where the statistic t is calculated by the formula from the means of the two RMSD samples, the sizes m and n of the two data sets, and the unbiased estimates of the variances of the two data sets. The P value is then obtained from t using a t-test table, and molecules with a t-test P value of less than 0.05 proceed to the subsequent free energy calculation step.
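A minimal sketch of a two-sample t statistic built from the quantities named above (two sample means, sample sizes m and n, and unbiased variance estimates). It assumes the unequal-variance (Welch) form, which is only one possibility; the text does not say whether a pooled-variance or a Welch test is used, and the function name is illustrative.

```python
import math
from statistics import mean, variance


def welch_t(sample_x, sample_y):
    """Return (t, degrees_of_freedom) for Welch's two-sample t-test."""
    m, n = len(sample_x), len(sample_y)
    if m < 2 or n < 2:
        raise ValueError("each sample needs at least two values")
    var_x = variance(sample_x)  # unbiased estimate (divides by m - 1)
    var_y = variance(sample_y)  # unbiased estimate (divides by n - 1)
    se_squared = var_x / m + var_y / n
    if se_squared == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_x) - mean(sample_y)) / math.sqrt(se_squared)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = se_squared ** 2 / (
        (var_x / m) ** 2 / (m - 1) + (var_y / n) ** 2 / (n - 1)
    )
    return t, dof
```

The p-value is then read from a t table, as the text describes, or computed with a statistics package; it is not computed here because the standard library has no t-distribution.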
Preferably, the stability index of the CTD is obtained by the above calculation. Based on the novel mechanism of action between CAMs and the capsid protein shown in FIG. 6, 30 ns molecular dynamics simulations and trajectory analyses are carried out for five small molecules (AT-130, GLP-26, NVR-3-778, BAY-41-4109 and SPA), a small-molecule structure-activity relationship regression model is constructed, and the EC50 of candidate small molecules is predicted with the regression model.
The protein-ligand binding free energy calculation flow is shown in FIG. 9. Starting from the simulation input files, the psf and crd files generated by CHARMM-GUI are converted with ParmEd into the prmtop and inpcrd files of the simulation system. The ligand residue numbers to be separated and the solvent and ion residue names are then specified, and ante-MMPBSA is used to generate the prmtop and inpcrd files of the receptor, the ligand, the complex and the solvated system. Finally, the MMPBSA calculation script is used to calculate the protein receptor-ligand binding free energy over frames 150-300 of the trajectory at an interval of 2 frames. The binding free energy of each candidate small molecule ligand with the protein is calculated and compared with that of GLP-26 with the protein; candidate small molecules whose binding free energy compares favorably with that of the reference molecule are selected as lead compounds for subsequent biological experimental verification.
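The binding free energy is calculated as:
$$\Delta G_{\mathrm{bind,solv}} = \Delta G_{\mathrm{bind,vacuum}} + \Delta G_{\mathrm{solv,complex}} - \left(\Delta G_{\mathrm{solv,ligand}} + \Delta G_{\mathrm{solv,receptor}}\right)$$
where $\Delta G_{\mathrm{bind,solv}}$ is the protein receptor-ligand binding free energy in the solvated system;
$\Delta G_{\mathrm{bind,vacuum}}$ is the protein receptor-ligand binding free energy in vacuum;
$\Delta G_{\mathrm{solv,complex}}$ is the solvation free energy of the protein-ligand complex;
$\Delta G_{\mathrm{solv,ligand}}$ is the solvation free energy of the ligand;
$\Delta G_{\mathrm{solv,receptor}}$ is the solvation free energy of the protein receptor.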
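The five terms listed above combine through the standard thermodynamic cycle used in MM/PBSA-type calculations. The sketch below assumes that cycle and also shows the frame selection described earlier (frames 150-300 at an interval of 2); the function names are illustrative and any energy values passed in are the caller's own.

```python
def binding_free_energy(dg_bind_vacuum, dg_solv_complex,
                        dg_solv_ligand, dg_solv_receptor):
    """Binding free energy in solvent from the thermodynamic cycle."""
    return dg_bind_vacuum + dg_solv_complex - (dg_solv_ligand + dg_solv_receptor)


def frames_for_analysis(start=150, end=300, interval=2):
    """Frame indices used for the calculation, with both ends included."""
    return list(range(start, end + 1, interval))
```

With the stated settings, `frames_for_analysis()` selects 76 frames from the second half of the 300-frame trajectory.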
In summary, by means of the above technical solution, the present invention constructs a method for the de novo design and virtual screening of HBV capsid assembly modulators using the GENTRL generative model and a variety of qualitative and quantitative structure-activity relationship analysis techniques based on scaffold hopping, molecular docking, molecular dynamics simulation and ligand-protein binding free energy calculation.
The foregoing describes only preferred embodiments of the invention and is not intended to limit it; all modifications, equivalents, alternatives and improvements made within the spirit and scope of the invention fall within its scope of protection.
Claims (9)
1. A de novo design and virtual screening method for novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry, characterized in that the method comprises the following steps:
S1, constructing a full-length hepatitis B capsid protein structure: obtaining the amino acid sequence of the full-length wild-type core capsid protein of the hepatitis B virus, and predicting the dimer structure of the full-length wild-type HBV core capsid protein;
S2, generating and constructing a candidate small molecule database: training a GENTRL generative model on the obtained training compound set, and generating a candidate small molecule database with the trained GENTRL generative model;
S3, constructing the scaffold hopping model and screening: preliminarily screening the candidate small molecule database using the drug-likeness rule of five, calculating the Euclidean distance between each database molecule and the target molecule based on WHALES descriptors, and performing scaffold hopping screening according to structural similarity;
S4, activity screening and binding mode prediction based on molecular docking: docking the small molecules to the HBV capsid protein with molecular docking software, predicting the binding mode of each small molecule with the HBV capsid protein, and selecting small molecules with a favorable binding mode;
S5, prediction and screening based on a structure-activity relationship from molecular dynamics simulation: analyzing the stabilizing effect of each small molecule on the C-terminal domain of the HBV capsid protein using molecular dynamics simulation software combined with trajectory analysis packages, constructing a structure-activity relationship model, predicting the EC50 of the small molecule, calculating the binding free energy between the small molecule and the HBV capsid protein, and screening for capsid assembly modulators with anti-HBV activity.
2. The de novo design and virtual screening method for novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry according to claim 1, wherein obtaining the amino acid sequence of the full-length wild-type core capsid protein of the hepatitis B virus and predicting the dimer structure of the full-length wild-type HBV core capsid protein comprises the following steps:
S11, obtaining the amino acid sequence of the full-length wild-type core capsid protein of the hepatitis B virus from the NCBI bioinformatics database;
S12, predicting the dimer structure of the full-length wild-type HBV core capsid protein using the homomultimer prediction model of AlphaFold2, and performing energy minimization.
3. The de novo design and virtual screening method for novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry according to claim 1, wherein the training compound set is obtained from the ChEMBL and ZINC databases;
the training data in the training compound set comprise the extended-connectivity fingerprints of HBV capsid assembly modulators, common capsid modulators and random ZINC molecules;
the GENTRL generative model comprises a variational autoencoder, a latent-space probability distribution, a generator and a reward function based on an SVM classification algorithm.
4. The de novo design and virtual screening method for novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry according to claim 1, wherein the preliminary screening of the candidate small molecule database using the drug-likeness rule of five, the calculation of the Euclidean distance between database molecules and the target molecule using WHALES descriptors, and the scaffold hopping screening according to structural similarity comprise the following steps:
S31, preliminarily screening the candidate small molecule database using Lipinski's drug-likeness rule of five, and generating 3D structures for the small molecules using RDKit and Open Babel;
S32, calculating 3D descriptors of the small molecules using WHALES, obtaining the Euclidean distance between each database molecule and the target compound, and performing scaffold hopping according to the Euclidean distance ranking.
5. The de novo design and virtual screening method for novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry according to claim 1, wherein the docking of the small molecules to the HBV capsid protein with molecular docking software comprises:
using the full-length wild-type HBV core capsid protein dimer predicted by AlphaFold2 as the receptor structure, and pretreating the receptor using Chimera and Maestro;
pretreating the small molecule ligands using RDKit and Open Babel;
the docking software is SMINA; each small molecule is scored according to affinity, and the 10 top-ranked compounds are taken forward to subsequent molecular dynamics simulation screening.
6. The de novo design and virtual screening method for novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry according to claim 1, wherein analyzing the stabilizing effect of the small molecules on the C-terminal domain of the HBV capsid protein using molecular dynamics simulation software combined with trajectory analysis packages comprises the following steps:
preparing the simulation input files with CHARMM-GUI, simulating for 30 ns with the CHARMM36 molecular force field and the OpenMM software, and generating a 300-frame trajectory file;
recording the interaction between the HBV capsid protein and the small molecule as a dcd file containing the 3D trajectory, wherein the dcd file holds the position of every atom of the HBV capsid protein and the ligand in each of the 300 frames of the simulation;
reading the dcd file with MDTraj and calculating the stability indices of the C-terminal domain of the HBV capsid protein, wherein the stability indices are the RMSF and RMSD of residues 150-183:
$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\delta_i^{2}}, \qquad \mathrm{RMSF} = \sqrt{\frac{1}{T}\sum_{t_j=1}^{T}\left(x_i(t_j)-\tilde{x}_i\right)^{2}}$$
where N is the total number of atoms; $\delta_i^{2}$ is the squared positional offset between the i-th atom of the current frame and the i-th atom of the target frame, i.e., the sum of the squared offsets along the X, Y and Z axes; T is the total simulation duration; $x_i(t_j)$ is the Cartesian coordinates of atom i at time $t_j$; and $\tilde{x}_i$ is the Cartesian coordinates of atom i at the initial moment.
7. The de novo design and virtual screening method for novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry according to claim 6, wherein the stability calculation is based on extensive prior molecular dynamics simulations of HBV capsid proteins bound to known CAMs, from which a new mechanism of action between CAMs and the HBV capsid protein was found, namely that CAMs accelerate capsid assembly by stabilizing the C-terminal domain of the HBV capsid protein.
8. The de novo design and virtual screening method for novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry according to claim 1, wherein the structure-activity relationship model performs a t-test between the C-terminal domain RMSD of the ligand-free protein simulation system and the C-terminal domain RMSD of the protein simulation system with a bound small molecule ligand, calculates the p-value, and predicts the EC50 of the small molecule from the p-value.
9. The de novo design and virtual screening method for novel hepatitis B virus capsid assembly modulators based on a generative model and computational chemistry according to claim 1, wherein the binding free energy calculation is based on the dcd file generated by the simulation and the simulation input files; the binding free energy between the small molecule ligand and the HBV capsid protein is calculated using ParmEd and AMBER and compared with the binding free energies of known capsid assembly modulators, and the final lead compounds are selected for biological activity verification.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310736846.0A CN116504302B (en) | 2023-06-21 | 2023-06-21 | Novel hepatitis B virus capsid assembly regulator de novo design and virtual screening method based on generation model and computational chemistry |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116504302A (en) | 2023-07-28 |
CN116504302B (en) | 2023-11-17 |
Family
ID=87323355
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310736846.0A Active CN116504302B (en) | 2023-06-21 | 2023-06-21 | Novel hepatitis B virus capsid assembly regulator de novo design and virtual screening method based on generation model and computational chemistry |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116504302B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150218182A1 (en) * | 2011-08-02 | 2015-08-06 | Indiana University Research And Technology Corporation | Modulators of virus assembly as antiviral agents |
WO2020255013A1 (en) * | 2019-06-18 | 2020-12-24 | Janssen Sciences Ireland Unlimited Company | Combination of hepatitis b virus (hbv) vaccines and capsid assembly modulators being amide derivatives |
US20210285000A1 (en) * | 2020-03-05 | 2021-09-16 | Janssen Pharmaceuticals, Inc. | Combination therapy for treating hepatitis b virus infection |
CN114317832A (en) * | 2022-01-28 | 2022-04-12 | 徐州医科大学 | Method for detecting HBV core protein allosteric modulator related drug resistance locus |
US20220249647A1 (en) * | 2019-06-18 | 2022-08-11 | Janssen Sciences Ireland Unlimited Company | Combination of hepatitis b virus (hbv) vaccines and dihydropyrimidine derivatives as capsid assembly modulators |
CN115282278A (en) * | 2022-07-13 | 2022-11-04 | 山东大学 | Application of cholesterol regulator as antigen presentation promoter in treatment of hepatitis B |
US20220370447A1 (en) * | 2019-09-20 | 2022-11-24 | Hoffmann-La Roche Inc. | Method of treating hbv infection using a core protein allosteric modulator |
CN115938488A (en) * | 2022-11-28 | 2023-04-07 | 四川大学 | Method for identifying protein allosteric modulator based on deep learning and computational simulation |
Non-Patent Citations (2)
Title |
---|
KIM, HYEJIN 等: "Current Progress in the Development of Hepatitis B Virus Capsid Assembly Modulators: Chemical Structure, Mode-of-Action and Efficacy", 《MOLECULES》, vol. 26, no. 24, pages 1 - 19 * |
杨璐 等: "乙型肝炎病毒衣壳蛋白装配调节剂研究进展", 《中国药理学通报》, vol. 35, no. 11, pages 1481 - 1487 * |
Also Published As
Publication number | Publication date |
---|---|
CN116504302B (en) | 2023-11-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||