CN108256284A - A drug virtual screening method - Google Patents
A drug virtual screening method
- Publication number
- CN108256284A (application number CN201810002901.2A)
- Authority
- CN
- China
- Prior art keywords
- database
- candidate compound
- calculate node
- locally stored
- screening
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/50—Molecular design, e.g. of drugs
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/40—Searching chemical structures or physicochemical data
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/90—Programming languages; Computing architectures; Database systems; Data warehousing
Abstract
The present invention relates to a drug virtual screening method in which the screening object comprises multiple candidate compounds. The method includes the following steps: S1, write the information of all candidate compound molecules into a database; S2, fetch the record corresponding to one candidate compound molecule from the database; S3, save the fetched record as an input file in the local storage of a compute node; S4, perform screening analysis on the input file and write the analysis result to the local storage of the compute node; S5, read the analysis result from the local storage of the compute node and insert it into the database as a record; S6, fetch from the database the record of another candidate compound molecule that has not yet been processed and return to step S3, until the records of all candidate compound molecules have been processed. By shifting the burden to the local storage of the compute nodes and to the database server, the invention reduces the load on the metadata server and keeps system performance stable.
Description
Technical field
The invention belongs to the field of data management, and in particular relates to a drug virtual screening method.
Background art
Drug virtual screening refers to pre-screening compound molecules on a computer during drug development, before biological activity screening is carried out, in order to reduce the number of compounds that must actually be screened and to improve the efficiency of lead compound discovery. During virtual screening, the screening program must analyze upwards of a million candidate compounds one after another and produce a score for each compound. Some screening programs store each candidate small molecule in a separate file, which serves as one of the program's inputs; the scoring result of the screening is likewise stored in a separate file. Screening one candidate compound molecule therefore requires managing at least two small files.
At present, drug virtual screening is mainly carried out on high-performance computing clusters and supercomputers. Because the screening program runs on compute nodes, the relevant compound molecule data files must be stored directly on the globally shared storage file system of the cluster or supercomputer so that every selected compute node can access them. Completing one drug virtual screening workflow therefore requires managing on the order of a million small files stored in the globally shared storage file system.
The globally shared storage file systems currently used by clusters and supercomputers, such as the Lustre file system, are not good at managing large numbers of small files. Dividing the candidate drug small molecules into multiple groups and screening the groups one after another does not solve the problem: it limits the concurrency of the screening, since only one group can be screened at a time, and even with grouping the large number of small files involved in drug virtual screening is still stored directly on the global file system. This overloads the metadata server of the file system, causes file system performance to drop sharply, and affects the operation of the cluster or supercomputer.
Summary of the invention
To overcome the defects of the prior art, the present invention provides a drug virtual screening method that reduces the load on the metadata server and keeps system performance stable.
To solve the above technical problem, the present invention adopts the following solution: a drug virtual screening method in which the screening object comprises multiple candidate compounds, including the following steps:
S1, write the information of all candidate compound molecules into a database;
S2, fetch the record corresponding to one candidate compound molecule from the database;
S3, save the fetched record as an input file in the local storage of a compute node;
S4, perform screening analysis on the input file and write the analysis result to the local storage of the compute node;
S5, read the analysis result from the local storage of the compute node and insert it into the database as a record;
S6, fetch from the database the record of another candidate compound molecule that has not yet been processed and return to step S3, until the records of all candidate compound molecules have been processed.
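The loop formed by steps S1 to S6 can be sketched roughly as follows. This is only an illustration under stated assumptions: Python's built-in `sqlite3` stands in for the database, a temporary directory stands in for the local storage of a compute node, and `toy_score`, the table names, and the column names are hypothetical placeholders rather than anything specified in the patent. Deleting the local files after step S5 is likewise an assumption.

```python
import sqlite3
import tempfile
from pathlib import Path


def toy_score(structure: str) -> float:
    """Hypothetical placeholder for a real screening program; returns a fake score."""
    return -float(len(structure))


def run_screening(conn: sqlite3.Connection, local_dir: Path) -> None:
    while True:
        # S2 / S6: fetch one record that does not have a result yet.
        row = conn.execute(
            "SELECT idx, structure FROM molecules "
            "WHERE idx NOT IN (SELECT idx FROM results) LIMIT 1"
        ).fetchone()
        if row is None:
            break  # every record has been processed
        idx, structure = row

        # S3: save the record as an input file in node-local storage.
        input_file = local_dir / f"{idx}.in"
        input_file.write_text(structure)

        # S4: analyse the input file and write the result to local storage.
        output_file = local_dir / f"{idx}.out"
        output_file.write_text(str(toy_score(input_file.read_text())))

        # S5: read the result back and insert it into the database as a record.
        conn.execute(
            "INSERT INTO results (idx, score) VALUES (?, ?)",
            (idx, float(output_file.read_text())),
        )
        conn.commit()

        # Assumption: the local files are removed once the result is stored.
        input_file.unlink()
        output_file.unlink()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE molecules (idx INTEGER PRIMARY KEY, name TEXT, structure TEXT)")
conn.execute("CREATE TABLE results (idx INTEGER PRIMARY KEY, score REAL)")

# S1: write all candidate molecules into the database (toy SMILES strings).
conn.executemany(
    "INSERT INTO molecules VALUES (?, ?, ?)",
    [(1, "ethanol", "CCO"), (2, "benzene", "c1ccccc1"), (3, "methane", "C")],
)

with tempfile.TemporaryDirectory() as tmp:
    run_screening(conn, Path(tmp))
    leftover_files = list(Path(tmp).iterdir())

processed = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
```

In this sketch the shared file system never sees a per-molecule file: at any moment only the two files of the molecule currently being processed exist, and they live in node-local storage.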
Compared with the prior art, the present invention writes each candidate compound molecule into the database as one record, and the files generated during processing on a compute node are stored in that node's local storage. This avoids keeping a large number of small files in the globally shared storage file system and relieves the burden on the metadata server. Local storage is also easier to scale out than a metadata server and is more flexible, and the stability of the high-performance computing cluster or supercomputer system is not affected. In addition, inserting the analysis result into the database as a record can specifically take the form of inserting it as a field. Once the analysis results are stored in the database, data analysis and mining become more convenient: for example, a sorting algorithm can be applied to the analysis results directly, whereas in the prior art the analysis results must first be read out of files before they can be processed.
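As a small illustration of sorting analysis results directly in the database, the sketch below ranks stored scores with a single query. It uses Python's built-in `sqlite3` as a stand-in database; the `results` table, its columns, the sample values, and the convention that a lower score is better are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (idx INTEGER PRIMARY KEY, name TEXT, score REAL)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, "mol_a", -6.2), (2, "mol_b", -9.1), (3, "mol_c", -7.4)],  # made-up scores
)


def top_hits(conn: sqlite3.Connection, limit: int) -> list:
    """Return the best-scoring molecules, assuming lower scores are better."""
    rows = conn.execute(
        "SELECT name, score FROM results ORDER BY score ASC LIMIT ?", (limit,)
    )
    return rows.fetchall()


best = top_hits(conn, 2)
```

No result file has to be opened or parsed for this ranking, which is the convenience the paragraph above describes.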
Further, step S1 is specifically: create in the database a table or collection containing at least three fields (index, molecule name, and molecular structure information), and write each candidate compound molecule into the table or collection.
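One way to picture such a record is as a dictionary with three mandatory fields and room for more. The sketch below is illustrative only: the field names `index`, `name`, and `structure`, the helper functions, and the sample molecules are assumptions, and a plain Python list stands in for the table or collection.

```python
from typing import Any

# Hypothetical names for the three required fields.
REQUIRED_FIELDS = ("index", "name", "structure")


def make_record(index: int, name: str, structure: str, **extra: Any) -> dict[str, Any]:
    """Build one record; extra keyword arguments become additional fields."""
    return {"index": index, "name": name, "structure": structure, **extra}


def missing_fields(record: dict[str, Any]) -> list[str]:
    """List the required fields that a record lacks."""
    return [field for field in REQUIRED_FIELDS if field not in record]


# A plain list stands in for the table or collection.
collection = [
    make_record(1, "ethanol", "CCO"),
    make_record(2, "benzene", "c1ccccc1", source="toy"),  # an optional extra field
]
incomplete = {"index": 3, "name": "methane"}
```

Because the requirement is "at least" three fields, a record can later gain further fields, for example one holding the analysis result.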
Compared with the prior art, the beneficial effects of the present invention are as follows: candidate compound molecules and their analysis results are stored in the database as records and are converted into files in the local storage of a compute node only when they are used, rather than being stored directly as files in the globally shared storage file system of the cluster or supercomputer. The burden is thereby shifted to the local storage of the compute nodes and to the database server, which reduces the load on the metadata server and ensures stable system performance.
Description of the drawings
Fig. 1 is the flow chart of the method for the present invention.
Specific embodiment
The present invention is described in detail below with reference to a specific embodiment and the accompanying drawing.
As shown in Fig. 1, a drug virtual screening method, in which the screening object comprises multiple candidate compounds, includes the following steps:
S1, write the information of all candidate compound molecules into a database;
S2, fetch the record corresponding to one candidate compound molecule from the database;
S3, save the fetched record as an input file in the local storage of a compute node;
S4, perform screening analysis on the input file and write the analysis result to the local storage of the compute node;
S5, read the analysis result from the local storage of the compute node and insert it into the database as a record;
S6, fetch from the database the record of another candidate compound molecule that has not yet been processed and return to step S3, until the records of all candidate compound molecules have been processed.
In a specific implementation, the steps are carried out as follows:
Step S1: create in a MongoDB database a table or collection containing at least three fields (index, molecule name, and molecular structure information), take the candidate compound molecules from the ZINC database, and write each candidate compound molecule into the table or collection using software capable of reading and writing MongoDB databases;
Step S2: use the software capable of reading and writing MongoDB databases to fetch the record corresponding to one candidate compound molecule from the MongoDB database;
Step S3: save the fetched record as an input file in a Ramdisk in the local storage of the compute node;
Step S4: the screening software AutoDock Vina performs screening analysis on the input file and writes the analysis result to the Ramdisk in the local storage of the compute node;
Step S5: the software capable of reading and writing MongoDB databases reads the analysis result from the Ramdisk in the local storage of the compute node and inserts it into the MongoDB database as a record;
Step S6: use the software capable of reading and writing MongoDB databases to fetch from the MongoDB database the record of another candidate compound molecule that has not yet been processed, and return to step S3, until the records of all candidate compound molecules have been processed.
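The node-local part of this embodiment (steps S3 to S5) might look roughly like the sketch below. Several details are assumptions that the patent does not state: that `/dev/shm` serves as the Ramdisk (it is a RAM-backed tmpfs on many Linux systems), that the executable is named `vina`, that the ligand has already been converted to PDBQT, and that the search box is supplied through a config file. The helper names and the sample output text are hypothetical; the code only builds the command line and parses the `REMARK VINA RESULT` lines that AutoDock Vina writes to its output PDBQT file, and does not run the program.

```python
import re
from pathlib import Path


def pick_local_dir(candidates=("/dev/shm", "/tmp")) -> Path:
    """Return the first existing candidate directory, else the working directory."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_dir():
            return path
    return Path.cwd()


def vina_command(receptor: Path, ligand: Path, out: Path, config: Path) -> list[str]:
    """Build (but do not run) a Vina command line; 'vina' is an assumed binary name."""
    return [
        "vina",
        "--receptor", str(receptor),
        "--ligand", str(ligand),
        "--out", str(out),
        "--config", str(config),  # assumed to hold the search-box settings
    ]


# Each docked pose in the output file carries a line of this form.
RESULT_LINE = re.compile(r"^REMARK VINA RESULT:\s+(-?\d+(?:\.\d+)?)", re.MULTILINE)


def best_affinity(pdbqt_text: str):
    """Return the lowest reported affinity, or None if no result line is found."""
    scores = [float(value) for value in RESULT_LINE.findall(pdbqt_text)]
    return min(scores) if scores else None


# Made-up output text, used only to exercise the parser.
sample_output = (
    "MODEL 1\n"
    "REMARK VINA RESULT:      -7.5      0.000      0.000\n"
    "ENDMDL\n"
    "MODEL 2\n"
    "REMARK VINA RESULT:      -6.9      1.812      2.345\n"
    "ENDMDL\n"
)

work_dir = pick_local_dir()
command = vina_command(
    Path("receptor.pdbqt"), work_dir / "1.pdbqt", work_dir / "1_out.pdbqt", Path("box.conf")
)
score = best_affinity(sample_output)
```

The parsed value is what would be inserted into the database in step S5; storing a single number per molecule rather than a file is what makes later sorting straightforward.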
Claims (2)
1. A drug virtual screening method, in which the screening object comprises multiple candidate compounds, characterized in that it includes the following steps:
S1, write the information of all candidate compound molecules into a database;
S2, fetch the record corresponding to one candidate compound molecule from the database;
S3, save the fetched record as an input file in the local storage of a compute node;
S4, perform screening analysis on the input file and write the analysis result to the local storage of the compute node;
S5, read the analysis result from the local storage of the compute node and insert it into the database as a record;
S6, fetch from the database the record of another candidate compound molecule that has not yet been processed and return to step S3, until the records of all candidate compound molecules have been processed.
2. The drug virtual screening method according to claim 1, characterized in that step S1 is specifically: create in the database a table or collection containing at least three fields (index, molecule name, and molecular structure information), and write each candidate compound molecule into the table or collection as one record.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810002901.2A CN108256284A (en) | 2018-01-02 | 2018-01-02 | A drug virtual screening method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810002901.2A CN108256284A (en) | 2018-01-02 | 2018-01-02 | A drug virtual screening method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108256284A (en) | 2018-07-06 |
Family
ID=62725921
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810002901.2A (Pending) CN108256284A (en) | A drug virtual screening method | 2018-01-02 | 2018-01-02 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108256284A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103838830A (en) * | 2014-02-18 | 2014-06-04 | 广东亿迅科技有限公司 | Data management method and system of HBase database |
CN104573268A (en) * | 2015-01-26 | 2015-04-29 | 华东理工大学 | Interactive visual aided drug design system and implementing method |
CN105653680A (en) * | 2015-12-29 | 2016-06-08 | 北京农信互联科技有限公司 | Method and system for storing data on the basis of document database |
CN107346379A (en) * | 2016-05-07 | 2017-11-14 | 复旦大学 | A kind of screening technique using cathepsin D as the micromolecular inhibitor of target spot |
Non-Patent Citations (2)
Title |
---|
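宋新蕊 等 (Song Xinrui et al.): "计算机辅助药物筛选平台及应用" [Computer-aided drug screening platform and its applications], 《生物信息学》 [Bioinformatics] * |
李丽芬 (Li Lifen): "桌面化学数据库应用系统_ChemDataBase2的研究和实现" [Research and implementation of the desktop chemical database application system ChemDataBase2], 《中国优秀硕士学位论文全文数据库 信息科技辑》 [China Master's Theses Full-text Database, Information Science and Technology] * |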
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20180706 |