CN115364099B - Antibacterial application of repaglinide and antibacterial activity prediction and structural novelty evaluation method - Google Patents
- Publication number
- CN115364099B (CN202111293501.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K31/00—Medicinal preparations containing organic active ingredients
- A61K31/33—Heterocyclic compounds
- A61K31/395—Heterocyclic compounds having nitrogen as a ring hetero atom, e.g. guanethidine or rifamycins
- A61K31/435—Heterocyclic compounds having nitrogen as a ring hetero atom, e.g. guanethidine or rifamycins having six-membered rings with one nitrogen as the only ring hetero atom
- A61K31/44—Non condensed pyridines; Hydrogenated derivatives thereof
- A61K31/445—Non condensed piperidines, e.g. piperocaine
- A61K31/451—Non condensed piperidines, e.g. piperocaine having a carbocyclic group directly attached to the heterocyclic ring, e.g. glutethimide, meperidine, loperamide, phencyclidine, piminodine
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P31/00—Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
- A61P31/04—Antibacterial agents
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/30—Prediction of properties of chemical compounds, compositions or mixtures
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/60—In silico combinatorial chemistry
- G16C20/64—Screening of libraries
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A50/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
- Y02A50/30—Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change
Abstract
The application discloses an antibacterial application of repaglinide together with a method for antibacterial activity prediction and structural novelty evaluation, and relates to the technical field of medicine. The method predicts the antibacterial activity of a compound and evaluates its structural novelty, and specifically comprises the following steps: step 1, collecting and curating high-throughput antibacterial activity data to form an antibacterial activity reference dataset; step 2, generating Daylight molecular fingerprint features of the reference compounds using Pybel and PyDPI and constructing an activity prediction model; step 3, predicting the antibacterial activity and evaluating the structural novelty of the compounds to be tested using the model and fmcsR; step 4, experimentally verifying repaglinide, which shows high potential. The application can provide important ideas and guidance for the development of novel antibacterial drugs and, more importantly, provides a novel low-toxicity antibacterial active compound, repaglinide, to cope with the increasingly serious crisis of bacterial drug resistance.
Description
Technical Field
The application relates to the technical field of medicine, and in particular to an antibacterial application of repaglinide and a method for antibacterial activity prediction and structural novelty evaluation.
Background
Antibiotics are a cornerstone of modern medicine, and their discovery and use have saved countless lives. Alexander Fleming reported the discovery of penicillin in 1929, and its subsequent use in treating various infectious diseases marked the advent of the antibiotic era. Between 1960 and 1970, scientists developed numerous antibacterial drugs that came into wide medical use. In recent decades, however, owing to factors such as economic considerations and the inherent difficulty of developing antibacterial drugs, development has been seriously delayed, and new antibacterial drugs are appearing far more slowly than bacterial resistance is developing.
Controlling the spread of bacterial drug resistance and developing novel antibacterial drugs have become a consensus across society. However, the development of new antibacterial drugs faces a number of difficulties. First, the known antibiotic classes were largely explored in the early stages, and it is now difficult to find new antibacterial agents; research pipelines across the industry are severely short of antimicrobial candidates. Discovering novel antibacterial active compounds within a chemical structure space of billions of compounds is therefore of great importance for antibacterial drug development. At present, however, no ready-made method or framework exists for this purpose, and it remains a challenging task.
Repaglinide is a short-acting insulin secretagogue used in patients with type 2 (non-insulin-dependent) diabetes whose hyperglycemia is not effectively controlled by diet, weight loss and exercise. To date, repaglinide has not been reported to have antibacterial activity. Moreover, the prior art offers no method for discovering novel antibacterial active compounds and verifying them experimentally. To this end, the application proposes a method for predicting antibacterial activity and evaluating structural novelty, by which the antibacterial potential of all marketed drugs is systematically explored and evaluated.
Disclosure of Invention
(I) Object of the application
In view of the above, the application aims to provide an antibacterial application of repaglinide and a method for antibacterial activity prediction and structural novelty evaluation, so as to predict the antibacterial activity of a compound and evaluate its structural novelty.
(II) Technical scheme
In order to achieve the above technical aim, the application provides an antibacterial application of repaglinide and a method for antibacterial activity prediction and structural novelty evaluation:
the method comprises predicting the antibacterial activity of a compound and evaluating the structural novelty of the compound, wherein predicting the antibacterial activity of the compound specifically comprises the following steps:
step 1, collecting high-throughput data related to antibacterial activity and cytotoxicity, and curating, filtering and analyzing the data to form an antibacterial activity benchmark database;
step 2, generating constitutional descriptors, topological descriptors, molecular connectivity descriptors, molecular charge descriptors and Daylight molecular fingerprint features of the compound to be predicted using Pybel and PyDPI;
step 3, selecting among the features generated in step 2 using the feature selection module of scikit-learn to obtain the corresponding features to be tested;
step 4, constructing a support vector machine prediction model and a random forest prediction model;
step 5, evaluating the classification performance of all models constructed in step 4 using 5-fold cross-validation repeated 10 times;
step 6, after the evaluation in step 5, combining the support vector machine model and the random forest model that show good prediction performance to form an antibacterial prediction model;
step 7, applying the antibacterial prediction model to the features to be tested from step 3, and taking the compound as a candidate antibacterial compound when both models predict that it has antibacterial activity.
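The consensus rule in step 7 can be sketched as follows. This is a minimal illustration, assuming each model's output has already been reduced to one boolean per compound; the function name `consensus_candidates` and the compound identifiers are hypothetical and do not come from the application.

```python
from typing import Dict, List


def consensus_candidates(svm_pred: Dict[str, bool], rf_pred: Dict[str, bool]) -> List[str]:
    """Return the compounds that BOTH models predict as antibacterial."""
    # Only compounds scored by both models can satisfy the consensus rule.
    shared = svm_pred.keys() & rf_pred.keys()
    return sorted(c for c in shared if svm_pred[c] and rf_pred[c])


# Hypothetical predictions for three placeholder compounds.
svm = {"cmpd_A": True, "cmpd_B": True, "cmpd_C": False}
rf = {"cmpd_A": True, "cmpd_B": False, "cmpd_C": False}

print(consensus_candidates(svm, rf))  # ['cmpd_A']
```

Requiring agreement between the two classifiers trades some recall for fewer false positives, which suits a screen whose hits go on to laboratory testing.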
Preferably, the data collected in step 1 specifically include a baseline dataset of antibacterial activity and a dataset of all marketed small molecule drugs;
wherein the baseline dataset of antibacterial activity consists of all antibacterial activity data downloaded from the ChEMBL database;
and the dataset of all marketed small molecule drugs consists of all marketed small molecule drugs, together with their corresponding information, downloaded from the DrugBank database.
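Preferably, the evaluation in step 5 uses five indexes, namely the ROC curve, accuracy, precision, recall and F1 score, of which the latter four are calculated as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 × Precision × Recall / (Precision + Recall)
wherein TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives.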
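The four count-based indexes can be computed directly from TP, TN, FP and FN. The sketch below uses the standard definitions of accuracy, precision, recall and F1; the function name and the example counts are hypothetical and are not results reported in the application.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to exercise the formulas.
m = classification_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

The guards against zero denominators are a design choice of this sketch; they return 0.0 when a metric is undefined.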
Preferably, in step 4, the support vector machine prediction model is constructed using libsvm as wrapped in Scikit-learn, a Python-based machine learning library.
Preferably, in step 4, the random forest prediction model is constructed by training and predicting on the samples with the random forest classifier in Scikit-learn, a Python-based machine learning library.
Preferably, evaluating the structural novelty of the compound specifically includes the following steps:
step 1, calculating, using Pybel, the overall structural similarity between a candidate compound and all known antibacterial drugs, measured by the Tanimoto coefficient (TC), which is calculated as follows:
TC = C(i, j) / U(i, j), where C(i, j) is the number of features common to the molecular fingerprints of two small molecules i and j, and U(i, j) is the total number of distinct features present in the molecular fingerprint of either i or j;
step 2, generating FP2 molecular fingerprints using Pybel and calculating the TC value;
step 3, judging whether the calculated TC value is lower than 0.5; if so, the similarity of the two small molecules is very low and the structure of the selected compound is novel.
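The Tanimoto calculation can be illustrated on fingerprints represented as collections of set-bit positions. This is a minimal sketch, assuming the fingerprints are already available as lists of bit indices; the function name `tanimoto` and the bit positions below are hypothetical placeholders, not FP2 output for any real molecule.

```python
from typing import Iterable


def tanimoto(bits_i: Iterable[int], bits_j: Iterable[int]) -> float:
    """TC = |common features| / |features present in either fingerprint|."""
    a, b = set(bits_i), set(bits_j)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: treated as dissimilar in this sketch
    return len(a & b) / len(union)


# Placeholder bit positions for two fingerprints.
fp_i = [1, 5, 9, 12, 40]
fp_j = [1, 5, 10, 40, 77, 90]

tc = tanimoto(fp_i, fp_j)
print(tc, tc < 0.5)  # 0.375 True
```

In Pybel, fingerprint objects returned by `calcfp("FP2")` are understood to expose their set bits through a `bits` attribute and to return the Tanimoto coefficient for `fp1 | fp2`; this should be checked against the installed Open Babel version before relying on it.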
Preferably, evaluating the structural novelty of the compound further includes the following steps:
step 1, constructing a substructure library of the active groups of all known marketed antibacterial drugs, and then using fmcsR to perform a substructure search on newly discovered candidate antibacterial compounds;
step 2, if the candidate compound does not contain an active substructure of any known antibacterial agent and its overall similarity is less than 0.5, the compound is structurally novel.
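The combined novelty criterion can be written as a small decision function. This sketch assumes that "overall similarity" is taken as the maximum TC against the known antibacterial drugs and that the substructure search result has already been reduced to a boolean; both are assumptions of the example, and the function name and similarity values are hypothetical.

```python
from typing import Sequence


def is_structurally_novel(
    similarities: Sequence[float],
    contains_known_substructure: bool,
    threshold: float = 0.5,
) -> bool:
    """Novel only if no known active substructure is present AND similarity stays below the threshold."""
    if contains_known_substructure:
        return False
    # Assumption: the largest TC against any known drug is the value compared to the threshold.
    return max(similarities, default=0.0) < threshold


# Hypothetical TC values against three known antibacterial drugs.
print(is_structurally_novel([0.12, 0.31, 0.20], contains_known_substructure=False))  # True
print(is_structurally_novel([0.12, 0.31, 0.20], contains_known_substructure=True))   # False
```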
In addition, the application provides the use of repaglinide in the preparation of an antibacterial medicament.
From the above technical scheme, the application has the following beneficial effects:
the application uses machine learning and integrated antibacterial data to develop a novel method that accurately predicts the antibacterial activity of a compound and evaluates its structural novelty, for use in exploring novel antibacterial active compounds. The accuracy of the method exceeds 91%, and libraries of hundreds of millions of compounds can be screened rapidly. Screening the DrugBank database allows marketed drugs to be repositioned as novel antibacterial drugs. Since marketed drugs have generally passed safety tests, they have low toxicity.
Repaglinide, which the application predicts and verifies experimentally, has inhibitory activity against various bacteria and fungi and is safe and of low toxicity; in addition, it differs from existing antibacterial drugs in the following respects. Repaglinide is chemically distinct from the antibacterial agents currently on the market: it does not contain the active substructure of any known antibacterial agent, and its overall similarity to them is less than 0.3. Repaglinide is a short-acting insulin secretagogue used in patients with type 2 (non-insulin-dependent) diabetes whose hyperglycemia is not effectively controlled by diet, weight loss and exercise. To date, repaglinide has not been reported to have antibacterial activity. It is therefore expected to be applicable to the preparation of medicines against drug-resistant bacteria.
In summary, the application not only provides important ideas and guidance for the development of novel antibacterial drugs, but also provides a novel low-toxicity antibacterial active compound, repaglinide, to cope with the increasingly serious crisis of bacterial drug resistance.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of the method for predicting the antibacterial activity of a compound and evaluating its structural novelty.
Fig. 2 is the chemical structure of repaglinide provided by the present application.
Fig. 3 shows the performance of the support vector machine model and the random forest model in antibacterial activity prediction.
Fig. 4 shows the results of the structural similarity calculation and activity prediction for repaglinide provided by the present application.
Fig. 5 shows the results of the structural novelty evaluation of repaglinide provided by the present application.
Fig. 6 shows the experimental antibacterial activity data for repaglinide provided by the present application.
Detailed Description
The following description is merely exemplary in nature and is not intended to limit the present disclosure, application, or uses. It should be understood that throughout the drawings, the same or similar reference numerals indicate the same or similar parts and features. The drawings merely schematically illustrate the concepts and principles of embodiments of the disclosure and do not necessarily illustrate the specific dimensions and proportions of the various embodiments of the disclosure. Specific details or structures may be shown in exaggerated form in particular figures to illustrate related details or structures of embodiments of the present disclosure.
Referring to Figs. 1-6:
Example 1
A method for predicting the antibacterial activity of a compound and evaluating its structural novelty comprises predicting the antibacterial activity of the compound and evaluating the structural novelty of the compound, wherein predicting the antibacterial activity of the compound specifically comprises the following steps:
step 1, collecting high-throughput data related to antibacterial activity and cytotoxicity, and curating, filtering and analyzing the data to form an antibacterial activity benchmark database;
step 2, generating constitutional descriptors, topological descriptors, molecular connectivity descriptors, molecular charge descriptors and Daylight molecular fingerprint features of the compound to be predicted using Pybel and PyDPI;
step 3, selecting among the features generated in step 2 using the feature selection module of scikit-learn to obtain the corresponding features to be tested;
step 4, constructing a support vector machine prediction model and a random forest prediction model;
step 5, evaluating the classification performance of all models constructed in step 4 using 5-fold cross-validation repeated 10 times;
step 6, after the evaluation in step 5, combining the support vector machine model and the random forest model that show good prediction performance to form an antibacterial prediction model;
step 7, applying the antibacterial prediction model to the features to be tested from step 3, and taking the compound as a candidate antibacterial compound when both models predict that it has antibacterial activity.
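The evaluation scheme of step 5, 5-fold cross-validation repeated 10 times, yields 50 train/test splits per model. The sketch below generates such splits with the standard library only; it is unstratified and reshuffles the sample indices on every repetition, which are assumptions of this example rather than details given in the application. The function name and the sample count are hypothetical.

```python
import random
from typing import Iterator, List, Tuple


def repeated_kfold(
    n_samples: int, n_splits: int = 5, n_repeats: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for n_repeats rounds of n_splits-fold CV."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh shuffle per repetition
        folds = [idx[k::n_splits] for k in range(n_splits)]
        for k in range(n_splits):
            test = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, test


# Hypothetical dataset size, used only to show the number of splits.
splits = list(repeated_kfold(n_samples=1000))
print(len(splits))  # 50
```

Averaging a metric over all 50 test folds gives a more stable estimate than a single 5-fold run, because each repetition partitions the data differently.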
In addition, the data collected in step 1 specifically include a baseline dataset of antibacterial activity and a dataset of all marketed small molecule drugs;
wherein the baseline dataset of antibacterial activity consists of all antibacterial activity data downloaded from the ChEMBL database. Active and inactive compounds are distinguished by their half-maximal inhibitory concentration (IC50): a compound with an IC50 below 1000 nM (1 μM) is treated as antibacterial, and a compound with an IC50 above 10000 nM (10 μM) is treated as inactive. The baseline dataset contains 1097 antibacterial active compounds and 578 inactive compounds in total. Since the dataset contains fewer negative samples than positive samples, the positive set is randomly sampled to obtain a balanced dataset with the same number of positive and negative samples. This process is repeated multiple times to ensure that the prediction model does not deviate significantly between repetitions.
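The labeling and balancing described above can be sketched as follows. The nM thresholds and the class sizes (1097 active, 578 inactive) are taken from the text; the function names, the placeholder compound identifiers and the choice to leave intermediate IC50 values unlabeled are assumptions of this example.

```python
import random
from typing import List, Optional, Tuple

ACTIVE_BELOW_NM = 1000.0     # IC50 below this: active
INACTIVE_ABOVE_NM = 10000.0  # IC50 above this: inactive


def label_by_ic50(ic50_nm: float) -> Optional[int]:
    """Return 1 (active), 0 (inactive), or None for values between the two thresholds."""
    if ic50_nm < ACTIVE_BELOW_NM:
        return 1
    if ic50_nm > INACTIVE_ABOVE_NM:
        return 0
    return None  # assumption: intermediate compounds are left out of the benchmark


def balance(actives: List[str], inactives: List[str], seed: int) -> Tuple[List[str], List[str]]:
    """Randomly down-sample the larger class to the size of the smaller one."""
    rng = random.Random(seed)
    n = min(len(actives), len(inactives))
    return rng.sample(actives, n), rng.sample(inactives, n)


# Placeholder identifiers matching the class sizes stated in the text.
actives = [f"active_{i}" for i in range(1097)]
inactives = [f"inactive_{i}" for i in range(578)]

pos, neg = balance(actives, inactives, seed=0)
print(len(pos), len(neg))  # 578 578
```

Calling `balance` with different seeds produces the repeated balanced datasets used to check that model performance is stable across resamplings.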
The dataset of all marketed small molecule drugs consists of all marketed small molecule drugs, together with their corresponding information, downloaded from the DrugBank database. It contains 4196 marketed small molecule drugs in total, of which 427 are antibacterial drugs.
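Specifically, the evaluation in step 5 uses five indexes, namely the ROC curve, accuracy, precision, recall and F1 score, of which the latter four are calculated as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 × Precision × Recall / (Precision + Recall)
wherein TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives.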
In step 4, the support vector machine prediction model is constructed using libsvm as wrapped in Scikit-learn, a Python-based machine learning library.
Specifically, the area under the ROC curve (AUC) is used to select the best model and parameters. The optimal model uses the 'rbf' kernel function with penalty parameter C = 50; all other parameters keep their default settings.
The random forest prediction model is constructed by training and predicting on the samples with the random forest classifier in Scikit-learn, a Python-based machine learning library.
Specifically, the parameters are set as follows: (1) the number of decision trees is 950, a value selected using the area under the ROC curve (AUC); (2) all other parameters keep their default settings.
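The two classifiers with the stated settings can be instantiated in scikit-learn as sketched below. The kernel, C value and number of trees come from the text; the synthetic data from `make_classification`, the train/test slicing and the `random_state` values are assumptions added only so that the example runs, and they stand in for the descriptor and fingerprint features of the real benchmark.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the selected molecular features (not real data).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, y_train, X_new = X[:150], y[:150], X[150:]

# Support vector machine: RBF kernel, penalty parameter C = 50, other settings default.
svm_model = SVC(kernel="rbf", C=50)

# Random forest: 950 decision trees, other settings default (random_state fixed for repeatability).
rf_model = RandomForestClassifier(n_estimators=950, random_state=0)

svm_model.fit(X_train, y_train)
rf_model.fit(X_train, y_train)

svm_labels = svm_model.predict(X_new)
rf_labels = rf_model.predict(X_new)
print(svm_labels[:10])
print(rf_labels[:10])
```

scikit-learn's `SVC` is built on libsvm, which matches the implementation named in the text.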
Example 2
In the method for predicting the antibacterial activity of a compound and evaluating its structural novelty, evaluating the structural novelty of the compound comprises the following steps:
step 1, calculating, using Pybel, the overall structural similarity between a candidate compound and all known antibacterial drugs, measured by the Tanimoto coefficient (TC), which is calculated as follows:
TC = C(i, j) / U(i, j), where C(i, j) is the number of features common to the molecular fingerprints of two small molecules i and j, and U(i, j) is the total number of distinct features present in the molecular fingerprint of either i or j;
step 2, generating FP2 molecular fingerprints using Pybel and calculating the TC value;
step 3, judging whether the calculated TC value is lower than 0.5; if so, the similarity of the two small molecules is very low and the structure of the selected compound is novel.
In addition, as a preferable mode, the method for evaluating the structural novelty of the compound specifically comprises the following steps:
step 1, constructing a substructure library of the effective groups of all known antibacterial drugs on the market, and then utilizing fmcsR to perform substructure search on newly discovered candidate antibacterial compounds;
step 2, if the candidate compound does not contain the active substructure of the known antibacterial agent and the overall similarity is less than 0.5, the compound has structural novelty.
Specifically, the substructure library comprises the substructures of the effective groups of all known antibacterial agents on the market, such as sulfonamides, penicillins, cephalosporins, carbapenems, chloramphenicol, tetracyclines, aminoglycosides, macrolides, glycopeptides, quinolones, linezolid, lipopeptides and the like.
Repaglinide was evaluated using the above method. The calculated maximum similarity between repaglinide and the 427 existing antibacterial agents was 0.38, and the overall average similarity was only 0.20 (see Fig. 4). Repaglinide also does not contain the active substructure of any of the 10 common broad classes of antibacterial drugs (see Fig. 5). Repaglinide is thus structurally novel. Note that an overlap value of less than 1 in the table indicates that the substructure is not contained.
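The two-part novelty rule (no known active substructure, and overall similarity below 0.5) can be written as a short check. This sketch assumes that the per-class value from the substructure search lies between 0 and 1, with 1 meaning the substructure is fully contained, in line with the note that a value below 1 means the substructure is not included. The function names and per-class values are hypothetical.

```python
def contains_substructure(overlap):
    """A value of 1 is taken to mean the substructure is fully contained."""
    return overlap >= 1.0


def is_structurally_novel(max_tc, overlaps_by_class, tc_threshold=0.5):
    """Novel if no known active substructure is contained and max TC is below the threshold."""
    has_known_group = any(contains_substructure(v) for v in overlaps_by_class.values())
    return (not has_known_group) and max_tc < tc_threshold


# The maximum TC of 0.38 is the value reported for repaglinide in the text;
# the per-class values are hypothetical placeholders, not the patent's data.
overlaps = {"sulfonamides": 0.6, "quinolones": 0.7, "tetracyclines": 0.4}
novel = is_structurally_novel(0.38, overlaps)
```

Both conditions must hold: a compound with low overall similarity that still carries, for example, a complete quinolone core would not be counted as novel.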
Example three
Repaglinide was predicted and evaluated by the above method for predicting the antibacterial activity of a compound and evaluating its structural novelty, and was found to have both antibacterial activity and structural novelty. The antibacterial activity was then verified experimentally by the following steps:
Step 1, 20 μl each of Escherichia coli, Candida albicans, Bacillus subtilis and Staphylococcus aureus were added separately to LB liquid medium and shake-cultured at 37 ℃ until the cultures became turbid (OD600 ≈ 0.6). Each turbid culture was centrifuged (5000 rpm, 5 min), 100 μl of liquid medium was retained, and the cells were resuspended by pipetting and spread evenly on the surface of solid LB medium. A sterilized filter paper disc 5 mm in diameter was placed on the surface of the medium, and 10 μl of the repaglinide sample was added dropwise to the center of the disc. After the sample was completely absorbed, the plate was inverted and incubated at 37 ℃ for 10-12 hours, and the formation of an inhibition zone was observed; the formation of an inhibition zone indicates antibacterial activity.
Step 2, following the same procedure, ampicillin and fluconazole were used as positive controls: 4 μl of positive control solution was added dropwise to the center of a filter paper disc 5 mm in diameter.
Step 3, the bacteriostatic activity value of the test sample (relative to the positive control) was calculated according to the following formula:
Activity value of the test sample (%) = (inhibition zone diameter of the test sample / inhibition zone diameter of the positive control) × 100
Step 4, the measured antibacterial activity of repaglinide is shown in Fig. 6. Fig. 6 shows the inhibitory activity of repaglinide against various bacteria and fungi: repaglinide has a pronounced inhibitory effect on Escherichia coli, Staphylococcus aureus, Bacillus subtilis and Candida albicans. Its inhibitory activity against Escherichia coli was 81.53% of that of ampicillin, and its inhibitory activity against Candida albicans was 134.4% of that of fluconazole.
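The formula in Step 3 is a simple ratio of inhibition zone diameters. A minimal sketch follows; the function name and the diameters are hypothetical and are not measurements from the experiment.

```python
def relative_activity(sample_zone_mm, control_zone_mm):
    """Activity (%) = sample inhibition zone diameter / positive control diameter * 100."""
    if control_zone_mm <= 0:
        raise ValueError("positive control zone diameter must be positive")
    return sample_zone_mm / control_zone_mm * 100.0


# Hypothetical diameters in millimetres, for illustration only.
activity = relative_activity(12.0, 16.0)
```

A value above 100% means the sample produced a larger inhibition zone than the positive control, as in the 134.4% figure reported relative to fluconazole.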
The above experimental results show that repaglinide has a chemical structure different from that of the antibacterial drugs on the market and has inhibitory activity against various bacteria. Repaglinide is therefore an antibacterial compound with a novel structure and can be applied to the preparation of antibacterial drugs.
Accordingly, the application provides a use of repaglinide, namely in the preparation of an antibacterial drug.
By combining machine learning with integrated antibacterial data, the application develops a novel method that can accurately predict the antibacterial activity of a compound and evaluate its structural novelty, for use in discovering novel antibacterial compounds. The accuracy of the method exceeds 91%, and compound libraries containing hundreds of millions of compounds can be screened rapidly. Screening the DrugBank drug database allows marketed drugs to be repositioned as novel antibacterial drugs. Since marketed drugs have generally passed safety tests, they have low toxicity.
In addition to its inhibitory activity against various bacteria and fungi and its safety and low toxicity, the repaglinide predicted and experimentally verified by the application has the following characteristics that distinguish it from existing antibacterial drugs. Its chemical structure differs from that of existing marketed antibacterial agents: it does not contain the active substructure of any known antibacterial agent, and its overall similarity is less than 0.3. Repaglinide is a short-acting insulin secretagogue used in patients with type 2 (non-insulin-dependent) diabetes whose hyperglycemia is not effectively controlled by diet, weight loss and exercise. To date, repaglinide has not been reported to have antibacterial activity. It is therefore expected to be applicable to the preparation of drugs against drug-resistant bacteria.
In summary, the application not only provides important ideas and guidance for the development of novel antibacterial drugs, but also provides a novel low-toxicity antibacterial compound, repaglinide, to help address the increasingly serious crisis of bacterial drug resistance.
The exemplary implementation of the solution proposed by the present disclosure has been described in detail hereinabove with reference to the preferred embodiments, however, it will be understood by those skilled in the art that various modifications and adaptations can be made to the specific embodiments described above and that various combinations of the technical features, structures proposed by the present disclosure can be made without departing from the scope of the present disclosure, which is defined by the appended claims.
Claims (1)
1. Use of repaglinide, characterized in that the repaglinide is used for preparing a medicament against Escherichia coli, Candida albicans, Bacillus subtilis and Staphylococcus aureus.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111293501.XA CN115364099B (en) | 2021-11-03 | 2021-11-03 | Antibacterial application of repaglinide and antibacterial activity prediction and structural novelty evaluation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115364099A (en) | 2022-11-22 |
CN115364099B (en) | 2023-11-03 |
Family
ID=84060812
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202111293501.XA | Active | CN115364099B (en) | 2021-11-03 | 2021-11-03 | Antibacterial application of repaglinide and antibacterial activity prediction and structural novelty evaluation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115364099B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009073843A1 (en) * | 2007-12-06 | 2009-06-11 | Cytotech Labs, Llc | Inhalable compositions having enhanced bioavailability |
CN105769897A (en) * | 2016-03-02 | 2016-07-20 | 卢连伟 | Repaglinide containing drug composition for treating diabetic foot and preparation method thereof |
Also Published As
Publication number | Publication date |
---|---|
CN115364099A (en) | 2022-11-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||