CN108256284A - A drug virtual screening method - Google Patents

A drug virtual screening method

Info

Publication number
CN108256284A
Authority
CN
China
Prior art keywords
database
candidate compound
calculate node
locally stored
screening
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810002901.2A
Other languages
Chinese (zh)
Inventor
李家辉
陈品
张曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-01-02
Filing date
2018-01-02
Publication date
2018-07-06
Application filed by National Sun Yat Sen University
Priority to CN201810002901.2A
Publication of CN108256284A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50 Molecular design, e.g. of drugs
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/40 Searching chemical structures or physicochemical data
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/90 Programming languages; Computing architectures; Database systems; Data warehousing

Abstract

The present invention relates to a drug virtual screening method in which the screening objects comprise multiple candidate compounds. The method includes the following steps: S1, writing the information of all candidate compound molecules into a database; S2, retrieving from the database the record corresponding to one candidate compound molecule; S3, saving the retrieved record as an input file in the local storage of a compute node; S4, performing screening analysis on the input file and writing the analysis result to the local storage of the compute node; S5, reading the analysis result from the local storage of the compute node and inserting it into the database as a record; S6, retrieving from the database the record of another candidate compound molecule that has not yet been processed and returning to step S3, until the records of all candidate compound molecules have been processed. By transferring the burden to the local storage of the compute nodes and to the database server, the invention reduces the load on the metadata server and keeps system performance stable.

Description

A drug virtual screening method
Technical field
The invention belongs to the field of data management, and in particular relates to a drug virtual screening method.
Background art
Drug virtual screening refers to pre-screening compound molecules on a computer during drug research and development, before biological activity screening is carried out, in order to reduce the number of compounds that actually need to be screened and to improve the efficiency of lead compound discovery. During virtual screening, the screening program must analyze millions of candidate compounds one after another to obtain a score for each compound. Some screening programs store each candidate small molecule in a separate file that serves as one of the program's inputs; the scoring result of the screening is likewise stored in a separate file. Therefore, screening each candidate compound molecule requires managing at least two small files.
At present, drug virtual screening is mainly carried out on high-performance computing clusters and supercomputers. Because the screening program runs on compute nodes, the relevant compound molecule data files must be stored directly on the globally shared storage file system of the cluster or supercomputer so that every selected compute node can access them. Completing one drug virtual screening workflow therefore requires managing millions of small files stored on the globally shared storage file system.
The globally shared storage file systems currently used by clusters and supercomputers, such as the Lustre file system, are not good at managing large numbers of small files. Dividing the candidate drug small molecules into multiple groups and screening the groups one after another limits the concurrency of the screening, since only one group can be screened at a time. Moreover, even with grouping, the large number of small files involved in drug virtual screening is still stored directly on the global file system, which overloads the file system's metadata server, causes file system performance to drop sharply, and affects the operation of the cluster or supercomputer.
Summary of the invention
To overcome the defects of the prior art, the present invention provides a drug virtual screening method that reduces the load on the metadata server and keeps system performance stable.
To solve the above technical problem, the present invention adopts the following solution: a drug virtual screening method in which the screening objects comprise multiple candidate compounds, including the following steps:
S1, writing the information of all candidate compound molecules into a database;
S2, retrieving from the database the record corresponding to one candidate compound molecule;
S3, saving the retrieved record as an input file in the local storage of a compute node;
S4, performing screening analysis on the input file, and writing the analysis result to the local storage of the compute node;
S5, reading the analysis result from the local storage of the compute node, and inserting it into the database as a record;
S6, retrieving from the database the record of another candidate compound molecule that has not yet been processed, and returning to step S3, until the records of all candidate compound molecules have been processed.
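The loop formed by steps S1 to S6 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: Python's built-in sqlite3 module stands in for the database, a temporary directory stands in for the local storage of a compute node, and the `molecules` table, its column names and the `placeholder_screen` function are hypothetical.

```python
import os
import sqlite3
import tempfile


def placeholder_screen(input_path, output_path):
    """Hypothetical stand-in for a screening program: one input file in, one result file out."""
    with open(input_path) as f:
        structure = f.read()
    # Made-up score for illustration; a real screening program would compute it.
    score = -float(len(structure))
    with open(output_path, "w") as f:
        f.write(f"{score}\n")


def run_screening(conn, local_dir):
    """Steps S2 to S6: process unscreened records one at a time through node-local files."""
    processed = 0
    while True:
        # S2 / S6: take one record that has not been processed yet.
        row = conn.execute(
            "SELECT idx, structure FROM molecules "
            "WHERE score IS NULL ORDER BY idx LIMIT 1"
        ).fetchone()
        if row is None:
            break  # every record has been processed
        idx, structure = row
        input_path = os.path.join(local_dir, f"{idx}.in")
        output_path = os.path.join(local_dir, f"{idx}.out")
        # S3: save the record as an input file in local storage.
        with open(input_path, "w") as f:
            f.write(structure)
        # S4: screen the input file; the result is written to local storage.
        placeholder_screen(input_path, output_path)
        # S5: read the result back and store it in the database.
        with open(output_path) as f:
            score = float(f.read().strip())
        conn.execute("UPDATE molecules SET score = ? WHERE idx = ?", (score, idx))
        conn.commit()
        # The two small files are transient and never reach shared storage.
        os.remove(input_path)
        os.remove(output_path)
        processed += 1
    return processed


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE molecules ("
    "idx INTEGER PRIMARY KEY, name TEXT, structure TEXT, score REAL)"
)
# S1: write all candidate molecules to the database (toy data).
conn.executemany(
    "INSERT INTO molecules (idx, name, structure) VALUES (?, ?, ?)",
    [(1, "mol_a", "CCO"), (2, "mol_b", "c1ccccc1"), (3, "mol_c", "CC(=O)O")],
)
conn.commit()

with tempfile.TemporaryDirectory() as local_dir:
    count = run_screening(conn, local_dir)
    leftover_files = os.listdir(local_dir)
```

In this sketch the result is stored as a field of the molecule's existing record, and a record counts as unprocessed while that field is empty. Several compute nodes sharing one database would additionally need a way to claim a record so that two nodes do not process the same molecule; that is outside the scope of this sketch.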
Compared with the prior art, the present invention writes each candidate compound molecule to the database as one record, and the files generated while a compute node processes a record are saved in the local storage of that compute node. This avoids keeping a large number of small files on the globally shared storage file system and relieves the burden on the metadata server. Compared with the metadata server, local storage is easier to expand and more flexible, and using it does not affect the stability of the high-performance computing cluster or supercomputer system. In addition, inserting the analysis result into the database as a record can specifically mean inserting it as a field. Once the analysis results are stored in the database, data analysis and mining operations on them become more convenient; for example, a sorting algorithm can be applied to the analysis results directly, whereas in the prior art the analysis results must first be read out of files before they can be processed.
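As an illustration of sorting analysis results directly in the database, the following sketch ranks stored scores with a single query. It uses Python's built-in sqlite3 module as a stand-in database; the `results` table, its columns, the sample values and the convention that a lower score is better are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (idx INTEGER PRIMARY KEY, name TEXT, score REAL)")
# Made-up scores for three hypothetical molecules.
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, "mol_a", -6.2), (2, "mol_b", -9.1), (3, "mol_c", -7.4)],
)


def top_hits(conn, limit):
    """Return the best-scoring molecules, assuming a lower score is better."""
    return conn.execute(
        "SELECT name, score FROM results ORDER BY score ASC LIMIT ?", (limit,)
    ).fetchall()


best = top_hits(conn, 2)
```

No result file has to be opened or parsed for this ranking; the database does the sorting.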
Further, step S1 is specifically: creating in the database a table or collection that includes at least three fields, namely index, molecule name and molecular structure information, and writing each candidate compound molecule to the table or collection as one record.
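Step S1 can be sketched as follows, assuming the candidate molecules arrive as multi-molecule SDF text, in which each record ends with a `$$$$` line and begins with the molecule's title line. Python's built-in sqlite3 module stands in for the database; the function names, the `molecules` table and the sample records (which are placeholders, not chemically valid structures) are hypothetical.

```python
import sqlite3


def split_sdf(text):
    """Yield (name, record_text) for each molecule in multi-molecule SDF text."""
    lines = []
    for line in text.splitlines():
        if line.strip() == "$$$$":  # SDF record terminator
            if lines:
                # The first line of each record is the molecule title.
                yield lines[0].strip(), "\n".join(lines) + "\n$$$$\n"
            lines = []
        else:
            lines.append(line)


def load_molecules(conn, sdf_text):
    """Write each molecule as one record with index, name and structure fields."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS molecules ("
        "idx INTEGER PRIMARY KEY, name TEXT NOT NULL, structure TEXT NOT NULL)"
    )
    rows = [
        (i, name, record)
        for i, (name, record) in enumerate(split_sdf(sdf_text), start=1)
    ]
    conn.executemany(
        "INSERT INTO molecules (idx, name, structure) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


# Placeholder records for illustration only.
SAMPLE_SDF = (
    "example_mol_1\n  placeholder header\n\nM  END\n$$$$\n"
    "example_mol_2\n  placeholder header\n\nM  END\n$$$$\n"
)

conn = sqlite3.connect(":memory:")
loaded = load_molecules(conn, SAMPLE_SDF)
names = [row[0] for row in conn.execute("SELECT name FROM molecules ORDER BY idx")]
```

Storing the whole structure text in one field means a compute node can later recreate the input file from a single record, without any per-molecule file on shared storage.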
Compared with the prior art, the beneficial effects of the present invention are as follows: the candidate compound molecules and their analysis results are stored in the database in the form of records, and are converted into files in the local storage of a compute node only when they are used, instead of being stored directly as files on the globally shared storage file system of the cluster or supercomputer. The burden is thereby transferred to the local storage of the compute nodes and to the database server, which reduces the load on the metadata server and ensures stable system performance.
Description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to a specific embodiment and the accompanying drawing.
As shown in Fig. 1, a drug virtual screening method in which the screening objects comprise multiple candidate compounds includes the following steps:
S1, writing the information of all candidate compound molecules into a database;
S2, retrieving from the database the record corresponding to one candidate compound molecule;
S3, saving the retrieved record as an input file in the local storage of a compute node;
S4, performing screening analysis on the input file, and writing the analysis result to the local storage of the compute node;
S5, reading the analysis result from the local storage of the compute node, and inserting it into the database as a record;
S6, retrieving from the database the record of another candidate compound molecule that has not yet been processed, and returning to step S3, until the records of all candidate compound molecules have been processed.
In a specific implementation, the steps are carried out as follows:
Step S1: a table or collection including at least three fields, namely index, molecule name and molecular structure information, is created in a MongoDB database; candidate compound molecules are taken from the ZINC database, and each candidate compound molecule is written to the table or collection by software capable of reading and writing the MongoDB database.
Step S2: the record corresponding to one candidate compound molecule is retrieved from the MongoDB database by the software capable of reading and writing the MongoDB database.
Step S3: the retrieved record is saved as an input file in the Ramdisk local storage of the compute node.
Step S4: the screening software AutoDock Vina performs screening analysis on the input file and writes the analysis result to the Ramdisk local storage of the compute node.
Step S5: the software capable of reading and writing the MongoDB database reads the analysis result from the Ramdisk local storage of the compute node and inserts it into the MongoDB database as a record.
Step S6: the software capable of reading and writing the MongoDB database retrieves from the MongoDB database the record of another candidate compound molecule that has not yet been processed, and the method returns to step S3, until the records of all candidate compound molecules have been processed.
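Steps S3 to S5 of this embodiment can be sketched as below. The sketch only prepares file paths, builds a command line and parses a result; it does not run any software. It rests on several assumptions that are not stated in the source: that the Ramdisk is a tmpfs directory such as /dev/shm, that the stored structure is already in the PDBQT format AutoDock Vina reads, that the `vina` executable is on the PATH with the receptor and search box given in a configuration file, and that the output file lists binding modes best first in `REMARK VINA RESULT:` lines. All function names and file names are hypothetical.

```python
import os

RAMDISK_DIR = "/dev/shm"  # assumption: a tmpfs mount used as the node-local Ramdisk


def local_paths(index, ramdisk_dir=RAMDISK_DIR):
    """S3: node-local input and output paths for one molecule record."""
    ligand = os.path.join(ramdisk_dir, f"ligand_{index}.pdbqt")
    result = os.path.join(ramdisk_dir, f"result_{index}.pdbqt")
    return ligand, result


def build_vina_command(receptor, config, ligand_path, out_path):
    """S4: command line for one docking run (built here, not executed)."""
    return [
        "vina",
        "--receptor", receptor,
        "--config", config,
        "--ligand", ligand_path,
        "--out", out_path,
    ]


def best_vina_score(pdbqt_text):
    """S5: extract the first reported score from the output text, or None if absent."""
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, score, then two RMSD values.
            return float(line.split()[3])
    return None


def result_fields(index, pdbqt_text):
    """S5: fields to store back on the molecule's database record."""
    return {"index": index, "score": best_vina_score(pdbqt_text)}


# Made-up output text for illustration.
SAMPLE_OUTPUT = (
    "MODEL 1\n"
    "REMARK VINA RESULT:    -7.5      0.000      0.000\n"
    "ENDMDL\n"
    "MODEL 2\n"
    "REMARK VINA RESULT:    -7.1      1.832      2.945\n"
    "ENDMDL\n"
)

ligand_path, out_path = local_paths(7)
command = build_vina_command("receptor.pdbqt", "box.txt", ligand_path, out_path)
fields = result_fields(7, SAMPLE_OUTPUT)
```

In a full workflow the command list could be passed to `subprocess.run`, and the resulting dictionary written back to the MongoDB record by whatever client software is used to read and write the database.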

Claims (2)

1. A drug virtual screening method in which the screening objects comprise multiple candidate compounds, characterized by including the following steps:
S1, writing the information of all candidate compound molecules into a database;
S2, retrieving from the database the record corresponding to one candidate compound molecule;
S3, saving the retrieved record as an input file in the local storage of a compute node;
S4, performing screening analysis on the input file, and writing the analysis result to the local storage of the compute node;
S5, reading the analysis result from the local storage of the compute node, and inserting it into the database as a record;
S6, retrieving from the database the record of another candidate compound molecule that has not yet been processed, and returning to step S3, until the records of all candidate compound molecules have been processed.
2. The drug virtual screening method according to claim 1, characterized in that step S1 is specifically: creating in the database a table or collection that includes at least three fields, namely index, molecule name and molecular structure information, and writing each candidate compound molecule to the table or collection as one record.
CN201810002901.2A 2018-01-02 2018-01-02 A drug virtual screening method Pending CN108256284A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810002901.2A CN108256284A (en) 2018-01-02 2018-01-02 A drug virtual screening method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810002901.2A CN108256284A (en) 2018-01-02 2018-01-02 A drug virtual screening method

Publications (1)

Publication Number Publication Date
CN108256284A (en) 2018-07-06

Family

ID=62725921

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810002901.2A Pending CN108256284A (en) 2018-01-02 2018-01-02 A drug virtual screening method

Country Status (1)

Country Link
CN (1) CN108256284A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103838830A (en) * 2014-02-18 2014-06-04 广东亿迅科技有限公司 Data management method and system of HBase database
CN104573268A (en) * 2015-01-26 2015-04-29 华东理工大学 Interactive visual aided drug design system and implementing method
CN105653680A (en) * 2015-12-29 2016-06-08 北京农信互联科技有限公司 Method and system for storing data on the basis of document database
CN107346379A (en) * 2016-05-07 2017-11-14 复旦大学 A kind of screening technique using cathepsin D as the micromolecular inhibitor of target spot

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
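宋新蕊 et al.: "Computer-aided drug screening platform and its applications", 《Bioinformatics》 *
李丽芬: "Research and implementation of the desktop chemical database application system ChemDataBase2", 《China Master's Theses Full-text Database, Information Science and Technology》 *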

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20180706)