CN111402966B - Fingerprint design method for describing properties of small molecule fragments based on small molecule three-dimensional structure - Google Patents

Info

Publication number
CN111402966B
Authority
CN
China
Prior art keywords
atoms
small molecule
fingerprint
fingerprints
atom
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010150737.7A
Other languages
Chinese (zh)
Other versions
CN111402966A (en)
Inventor
季长鸽
单金文
张增辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University
Priority to CN202010150737.7A
Publication of CN111402966A
Application granted
Publication of CN111402966B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50 Molecular design, e.g. of drugs

Abstract

The invention discloses a fingerprint design method for describing the properties of small molecule fragments based on the three-dimensional structure of a small molecule. The method takes a three-dimensional small molecule structure as input, identifies the attribute atoms of the small molecule, calculates the spatial position relationship between the connection point atoms and the attribute atoms, and defines two different fingerprints describing the properties of small molecule fragments according to the intended use: one for database search and the other for similarity calculation. Compared with the prior art, both fingerprints can be generated rapidly. When applied to the design and modification of actual drug molecules, the method quickly finds many similar fragments and generates many new molecules, helping medicinal chemists design and modify drug molecules rapidly, and it is particularly effective for avoiding existing patents.

Description

Fingerprint design method for describing properties of small molecule fragments based on small molecule three-dimensional structure
Technical Field
The invention relates to the technical field of computer-aided drug design, in particular to a fingerprint design method for describing properties of small molecule fragments based on a small molecule three-dimensional structure.
Background
Throughout the drug design process, modifying small drug molecules to avoid existing patents is one of the main tasks. Accurate and effective modification reduces the cost of drug development, shortens the development cycle and improves efficiency. When a medicinal chemistry researcher designs a small drug molecule, part of the molecule is often protected by patents, and the researcher has to modify that part of the structure by inspecting the structure and drawing on personal experience. Alternatively, existing software such as Schrodinger or LeadIT provides an interface, but the modification is still made manually on the basis of personal experience. Both approaches rely heavily on the experience of the researcher and are therefore inefficient. When part of a small drug molecule is protected by a patent, the modification strategy is to find a replacement fragment with a similar function but a different structure. An accurate description of the characteristics of small molecule fragments, chiefly their three-dimensional structural features and attribute features, is therefore extremely important. A fragment search based on such a description fits practical use better, can effectively replace reliance on the researcher's experience, and helps medicinal chemistry researchers modify molecules more purposefully and efficiently.
In the prior art, the design and modification of small drug molecules depend too heavily on the experience of researchers. Patent avoidance is inefficient, small drug molecules cannot be modified accurately and effectively, and drug development cycles are long and costly, which greatly hinders medicinal chemistry researchers from designing and modifying small drug molecules purposefully and efficiently.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide a fingerprint design method for describing the properties of small molecule fragments based on the three-dimensional structure of a small molecule. The method takes a three-dimensional small molecule structure as input, identifies the attribute atoms of the small molecule, calculates the spatial position relationship between the connection point atoms and the attribute atoms, and designs two different fingerprints describing the properties of small molecule fragments. Depending on the usage scenario, the similarity of two small molecule fragments can be judged by calculating the similarity of their fingerprints. Both fingerprints can be generated quickly. Applied to the design and modification of actual drug molecules, they quickly find many similar fragments and generate many new molecules, helping medicinal chemists design and modify drug molecules accurately, effectively and quickly. The method fits practical use, can effectively replace reliance on the researcher's experience, supports more purposeful and efficient molecular modification, and thereby reduces the cost and duration of drug development, avoids existing patents and improves efficiency.
The purpose of the invention is achieved as follows: a fingerprint design method for describing the properties of small molecule fragments based on the three-dimensional structure of a small molecule, characterized in that a three-dimensional small molecule structure is input, the attribute atoms of the small molecule are identified, the spatial position relationship between the connection point atoms and the attribute atoms is calculated, and two different fingerprints describing the properties of small molecule fragments are designed. The specific calculation comprises the following steps:
Step 1: According to the atom type definitions of RDKit, find all target attribute atoms in the input three-dimensional small molecule structure and record them as the attribute atom set A. An attribute atom has one of two attributes: hydrogen bond donor or hydrogen bond acceptor.
Step 2: To enlarge the data volume, the position of every hydrogen atom of the three-dimensional small molecule is regarded as a possible connection point. Each such hydrogen atom is labelled a t atom, and the heavy atom bonded to it is labelled a b atom. Following the practice of actual small-molecule drug design, fragments are divided into two classes: those used with one connection point and those used with several connection points. Fingerprints for several connection points are built on the two-connection-point case, so the two-connection-point fingerprint is defined as follows: all connection points are combined in arbitrary pairs, the two connection point hydrogen atoms are defined as t1 and t2, and the two corresponding heavy atoms as b1 and b2.
Step 3: According to the intended use, two different types of fingerprint, one for database search and one for similarity calculation search, are designed for the cases of one connection point and several connection points as follows:
1) One connection point
The first type of fingerprint is calculated as follows: let the li atom be the i-th atom in the attribute atom set A; determine the attribute of the li atom (hydrogen bond donor or hydrogen bond acceptor); calculate the distance between the li atom and the b atom; and calculate the angle at the b atom in the triangle formed by the li atom, the b atom and the t atom. To describe the spatial size of the fragment, find the heavy atom farthest from the b atom and record that distance as LD.
The second type of fingerprint is calculated as follows: calculate the distances between every atom in the attribute atom set A and the b atom, and between every atom in A and every other atom in A, and assign these distances to a twenty-two-dimensional fingerprint. For distances greater than 1 and less than or equal to 6, each interval of 0.5 is one dimension; for distances greater than 6 and less than or equal to 10.2, each interval of 0.6 is one dimension; for distances greater than 10.2 and less than or equal to 15.2, each interval of 1.0 is one dimension.
2) Multiple connection points (fingerprint defined on the basis of two connection points)
Fingerprints for multiple connection points can be defined on the basis of two connection points. In the two-connection-point case the two types of fingerprint share a common part, namely the three-dimensional spatial position relationship between the two connection points: the Euclidean distance between the two connection point heavy atoms b1 and b2; the angles formed by the heavy atoms and the hydrogen atoms at the two ends, namely the angle t1, b1, b2 and the angle t2, b2, b1; and the dihedral angle of the two connection point heavy atoms and the two hydrogen atoms, namely t1, b1, b2, t2.
In the design of the two types of fingerprint for two connection points, the first type is handled as follows: for each attribute atom, a one-connection-point fingerprint is calculated with respect to the b1 atom and another with respect to the b2 atom, so that two fingerprints are calculated and recorded. The second type of fingerprint, which is aimed at the similarity of small molecule fragments, comprises, in addition to the three-dimensional spatial position relationship of the two connection points, the second-type fingerprints calculated from the b1 atom and from the b2 atom as in the one-connection-point case, which are then merged into a single fingerprint.
3) Centralized processing of two types of fingerprints
For the centralized processing of the two types of fingerprints, all connection point distances are multiplied by five and all angles are divided by 20. The distance is calculated according to the following formula (1):
[Formula (1): equation image not reproduced in this text]
The angle is calculated according to the following formula (2):
angle=(AB*BC)/(|AB|*|BC|) (2)
where AB is the vector between the connection atom and the attribute atom, and BC is the vector between the connection atom and the hydrogen atom. The right-hand side of formula (2) is the cosine of the angle between the two vectors.
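The distance, angle and scaling steps can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that formula (1) is the ordinary Euclidean distance, that both vectors in formula (2) start at the b atom, that the angle is the arccosine of the right-hand side of formula (2) in degrees, and that scaled values are rounded to integers. The function names and coordinates are hypothetical.

```python
import math

def distance(p, q):
    """Euclidean distance between two 3D points (assumed form of formula (1))."""
    return math.dist(p, q)

def angle_at_b(l, b, t):
    """Angle in degrees at vertex b of the triangle l-b-t.

    The cosine is the normalized dot product of formula (2); both vectors
    are taken to start at b, which is an assumption.
    """
    bl = [li - bi for li, bi in zip(l, b)]
    bt = [ti - bi for ti, bi in zip(t, b)]
    dot = sum(x * y for x, y in zip(bl, bt))
    cos_value = dot / (math.hypot(*bl) * math.hypot(*bt))
    cos_value = max(-1.0, min(1.0, cos_value))  # guard against rounding error
    return math.degrees(math.acos(cos_value))

def scale(dist, angle_deg):
    """Multiply the distance by 5 and divide the angle by 20.

    Rounding to integers is an assumption, not stated in the text.
    """
    return round(dist * 5), round(angle_deg / 20)

# Hypothetical coordinates in angstroms
l_atom = (3.0, 0.0, 0.0)     # attribute atom
b_atom = (0.0, 0.0, 0.0)     # connection point heavy atom
t_atom = (-0.77, 0.77, 0.0)  # hydrogen atom on the connection point

d = distance(l_atom, b_atom)
a = angle_at_b(l_atom, b_atom, t_atom)
scaled = scale(d, a)
```

With these coordinates the distance is 3 and the angle at b is 135 degrees, so the scaled pair is (15, 7).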
Compared with the prior art, the invention computes fingerprints for all systems, which enlarges the data volume, and stores the results in a database. The resulting description of small molecule fragments fits practical use, can effectively replace reliance on the researcher's experience, helps medicinal chemists design and modify drug molecules, and is highly effective in the important task of avoiding existing patents. When a new system appears, molecules similar in shape and attributes can be found quickly through similarity calculation, which helps medicinal chemistry researchers modify drug molecules quickly and in a targeted way and is particularly useful for directed modification of drug molecules.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram illustrating attribute atoms and their fingerprint definitions;
FIG. 3 is a schematic diagram of the angle with the b atom as the vertex;
FIG. 4 is a schematic diagram of the farthest distance LD;
FIG. 5 is a schematic diagram of the division of the twenty-two-dimensional distance ranges;
FIG. 6 is a diagram illustrating a second fingerprint definition;
FIG. 7 is a schematic view of a spatial structure fingerprint between two connection points;
FIG. 8 is a diagram of application results of a database search;
FIG. 9 is a diagram of application results of the similarity calculation.
Detailed Description
The invention identifies the attribute atoms of a small molecule, mainly hydrogen bond acceptors and hydrogen bond donors, divides small molecule fragments into the cases of one connection point and multiple connection points, and, based on the three-dimensional structure of the small molecule, designs two different fingerprints describing the properties of small molecule fragments, one for database search and one for small molecule similarity calculation search. The specific steps are as follows:
Step 1: According to the atom type definitions of RDKit, find all target attribute atoms in the input three-dimensional small molecule structure and record them as the attribute atom set A. An attribute atom has one of two attributes: hydrogen bond donor or hydrogen bond acceptor.
Step 2: The position of every hydrogen atom on the small molecule is taken as a connection point position. The heavy atom directly bonded to the hydrogen atom is taken as the target connection point atom and labelled a b atom, and the corresponding hydrogen atom is labelled a t atom. Three-dimensional small molecule fragments are divided into two classes, one connection point and multiple connection points. Multiple connection points are based on two connection points combined in arbitrary pairs; the two connection point hydrogen atoms are defined as t1 and t2, and the two corresponding heavy atoms as b1 and b2.
Step 3: According to the intended use, two different types of fingerprint, one for database search and one for similarity calculation search, are designed for the cases of one connection point and multiple connection points as follows:
1) One connection point
The first type of fingerprint is calculated as follows: define the li atom as the i-th atom in the attribute atom set A; determine the attribute of the li atom, that is, whether it is a hydrogen bond acceptor or a hydrogen bond donor; calculate the distance between the li atom and the b atom; and calculate the angle at the b atom in the triangle formed by the li atom, the b atom and the t atom. To describe the fragment size, find the heavy atom farthest from the b atom and record that distance as LD.
The second type of fingerprint is calculated as follows: calculate the distances between every atom in the attribute atom set A and the b atom, and between every atom in A and every other atom in A, and assign these distances to a twenty-two-dimensional fingerprint.
2) Multiple connection points
Fingerprints for multiple connection points can be defined on the basis of two connection points. Because the two types of fingerprint share a common part in the two-connection-point case, the three-dimensional spatial position relationship between the two connection points and their corresponding hydrogen atoms is defined: calculate the Euclidean distance between the two connection point heavy atoms b1 and b2; calculate the angles formed by the heavy atoms and the hydrogen atoms at the two ends, namely the angle t1, b1, b2 and the angle t2, b2, b1; and calculate the dihedral angle of the two connection point heavy atoms and the two hydrogen atoms, namely the dihedral angle t1, b1, b2, t2. Fingerprints for more than two connection points can be defined by extension from the two-connection-point case.
3) Centralized processing of two types of fingerprints
For the centralized processing of the two types of fingerprints, all connection point distances are multiplied by five and all angles are divided by 20. The distance is calculated according to the following formula (1):
[Formula (1): equation image not reproduced in this text]
The angle is calculated according to the following formula (2):
angle=(AB*BC)/(|AB|*|BC|) (2)
where AB is the vector between the connection atom and the attribute atom, and BC is the vector between the connection atom and the hydrogen atom. The right-hand side of formula (2) is the cosine of the angle between the two vectors.
The twenty-two-dimensional fingerprint is obtained as follows: for distances greater than 1 and less than or equal to 6, each interval of 0.5 is one dimension; for distances greater than 6 and less than or equal to 10.2, each interval of 0.6 is one dimension; for distances greater than 10.2 and less than or equal to 15.2, each interval of 1.0 is one dimension.
In the design of the two types of fingerprint for two connection points, the first type is handled as follows: for each attribute atom, a one-connection-point fingerprint is calculated with respect to the b1 atom and another with respect to the b2 atom, so that two fingerprints are calculated and recorded. The second type of fingerprint, which is aimed at the similarity of small molecule fragments, comprises, in addition to the three-dimensional spatial position relationship of the two connection points, the second-type fingerprints calculated from the b1 atom and from the b2 atom as in the one-connection-point case, which are then merged into one fingerprint.
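The enumeration of connection points in Step 2, and their pairwise combination for the two-connection-point case, can be sketched in Python as follows. The molecule representation (a list of element symbols plus a list of bonded index pairs), the function names and the toy fragment are hypothetical choices for illustration; the patent does not prescribe a data structure.

```python
from itertools import combinations

def connection_points(elements, bonds):
    """Return (t, b) index pairs: t is a hydrogen, b the heavy atom bonded to it."""
    points = []
    for i, j in bonds:
        if elements[i] == "H" and elements[j] != "H":
            points.append((i, j))
        elif elements[j] == "H" and elements[i] != "H":
            points.append((j, i))
    return points

def connection_point_pairs(points):
    """All unordered pairs of connection points, as ((t1, b1), (t2, b2))."""
    return list(combinations(points, 2))

# Hypothetical toy fragment: methanol, CH3-OH, atom indices 0..5
elements = ["C", "O", "H", "H", "H", "H"]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5)]

points = connection_points(elements, bonds)
pairs = connection_point_pairs(points)
```

The sketch keeps pairs whose hydrogens sit on the same heavy atom. Whether such pairs are kept in the patented method is not stated in the text, so a filter on b1 != b2 may be needed in practice.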
The present invention will be described in further detail with reference to specific examples.
Example 1
Referring to FIG. 1, the invention takes a three-dimensional small molecule file as input, identifies the attribute atoms of the small molecule, divides the molecule into the cases of a single connection point and multiple connection points, and describes the properties of the small molecule according to the intended use. Two different fingerprints describing the properties of small molecule fragments are designed; the specific calculation comprises the following steps:
Step 1: Referring to FIG. 2a, according to the atom type definitions of RDKit, all target attribute atoms are found in the input three-dimensional small molecule structure. There are mainly two attributes, hydrogen bond donor and hydrogen bond acceptor, and the atoms found are recorded as the attribute atom set A.
Step 2: All hydrogen atoms on the small molecule are regarded as connection points, and the cases of one connection point and multiple connection points are distinguished. For a three-dimensional small molecule, the position of every hydrogen atom is regarded as a possible connection point; the hydrogen atom is labelled a t atom and the heavy atom bonded to it a b atom. Following the practice of actual small-molecule drug design, three-dimensional small molecule fragments are divided into two classes: those used with one connection point and those used with several. Multiple connection points can be extended from two connection points. The two-connection-point case is defined by combining all single connection points in arbitrary pairs; the hydrogen atoms of the two connection points are defined as t1 and t2, and the two corresponding heavy atoms as b1 and b2.
Step 3: According to the intended use, two different types of fingerprint, one for database search and one for similarity calculation search, are designed for the cases of one connection point and multiple connection points as follows:
1) One connection point
The first type of fingerprint is calculated as follows: let the li atom be the i-th atom in the attribute atom set A, and determine the attribute of the li atom (hydrogen bond donor or hydrogen bond acceptor).
Referring to FIG. 2b, the distance between the li atom and the b atom is calculated.
Referring to FIG. 3, the angle at the b atom in the triangle formed by the li atom, the b atom and the t atom is calculated.
Examples of fingerprints calculated by the invention are shown in Table 1 below:
Table 1. Examples of fingerprints
Filename_b    Lig_fp
ligand_5      lig_D033015
ligand_6      lig_A026013
ligand_6      lig_D047011
ligand_7      lig_D044009
ligand_8      lig_D047006
ligand_9      lig_A028005
ligand_12     lig_A027010
Referring to FIG. 4, to describe the fragment size, LD denotes the largest distance between the b atom and any heavy atom of the small molecule.
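The LD value can be sketched in a few lines of Python. The element list, coordinates and function name below are hypothetical, and hydrogen is identified simply by the symbol "H".

```python
import math

def farthest_heavy_atom_distance(elements, coords, b_index):
    """Return LD: the largest distance from atom b to any other heavy atom.

    Raises ValueError if the fragment has no heavy atom other than b.
    """
    b = coords[b_index]
    return max(
        math.dist(b, xyz)
        for k, (el, xyz) in enumerate(zip(elements, coords))
        if el != "H" and k != b_index
    )

# Hypothetical fragment coordinates in angstroms
elements = ["C", "C", "O", "H"]
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 4.0, 0.0), (-1.0, 0.0, 0.0)]

ld = farthest_heavy_atom_distance(elements, coords, b_index=0)
```

Here the oxygen at (3, 4, 0) is the farthest heavy atom from the b atom at the origin, so LD is 5.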
The second type of fingerprint is calculated as follows: the distances between every atom in the attribute atom set A and the b atom, and between every atom in A and every other atom in A, are calculated.
Referring to FIG. 5, the distances are assigned to a twenty-two-dimensional fingerprint: for distances greater than 1 and less than or equal to 6, each interval of 0.5 is one dimension; for distances greater than 6 and less than or equal to 10.2, each interval of 0.6 is one dimension; for distances greater than 10.2 and less than or equal to 15.2, each interval of 1.0 is one dimension.
This second type of fingerprint is designed for similarity calculation. The distance range assigned to each position is shown in Table 2 below:
Table 2. Custom fingerprint structure
Index    Dist
00       (1,1.5]
01       (1.5,2]
02       (2,2.5]
03       (2.5,3]
04       (3,3.5]
05       (3.5,4]
06       (4,4.5]
07       (4.5,5]
08       (5,5.5]
09       (5.5,6]
10       (6,6.6]
11       (6.6,7.2]
12       (7.2,7.8]
13       (7.8,8.4]
14       (8.4,9.0]
15       (9.0,9.6]
16       (9.6,10.2]
17       (10.2,11.2]
18       (11.2,12.2]
19       (12.2,13.2]
20       (13.2,14.2]
21       (14.2,15.2]
Referring to FIG. 6a, the steps from the selected connection point of a molecule to the fingerprint definition are as follows. So that all fingerprints have the same length, all cases are laid out in advance, giving a final fingerprint length of 14 × 22. A position is marked 1 if an attribute distance falls in the corresponding range and 0 otherwise.
Referring to FIG. 6b, a fingerprint of 0s and 1s is shown to illustrate the definition of the whole fingerprint more intuitively.
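The binning of Table 2 and the 0/1 marking can be sketched in Python as follows. The sketch builds only a single 22-position block from a list of distances. How the 14 blocks of the full 14 × 22 fingerprint are assigned is not spelled out in the text and is not modelled here. The interval for index 08 is taken to be (5, 5.5], and the names EDGES, distance_bin and distance_block are hypothetical.

```python
from bisect import bisect_left

# Boundaries of the 22 half-open intervals (lo, hi] listed in Table 2
EDGES = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0,
         6.6, 7.2, 7.8, 8.4, 9.0, 9.6, 10.2,
         11.2, 12.2, 13.2, 14.2, 15.2]

def distance_bin(d):
    """Return the index 0..21 of the interval containing d, or None if d is outside (1, 15.2]."""
    idx = bisect_left(EDGES, d) - 1
    return idx if 0 <= idx < len(EDGES) - 1 else None

def distance_block(distances):
    """Return a 22-element 0/1 list: position k is 1 if any distance falls in interval k."""
    bits = [0] * (len(EDGES) - 1)
    for d in distances:
        k = distance_bin(d)
        if k is not None:
            bits[k] = 1
    return bits

# Hypothetical distances in angstroms; 20.0 lies outside every interval
block = distance_block([2.4, 2.45, 6.3, 20.0])
```

Using bisect_left on the boundary list places a distance that equals an upper boundary, such as 6.0, in the interval that ends there, which matches the half-open intervals of Table 2.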
2) Multiple connection points
In the case of multiple connection points, the two types of fingerprint share a common part, so fingerprints for multiple connection points can be defined on the basis of two connection points.
Referring to FIG. 7, the three-dimensional spatial position relationship between two connection points is defined: calculate the Euclidean distance between the two connection point heavy atoms b1 and b2; calculate the angles formed by the heavy atoms and the hydrogen atoms at the two connection points, namely the angle t1, b1, b2 and the angle t2, b2, b1; and calculate the dihedral angle of the two connection point heavy atoms and the two hydrogen atoms, namely t1, b1, b2, t2.
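The four geometric quantities for two connection points can be sketched in Python with the standard library only. The helper names and coordinates are hypothetical. The second angle is taken at b2 between t2 and b1, and the dihedral angle is returned as a signed value in degrees between -180 and 180. The range and sign convention are assumptions, since the text does not state them.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def angle(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    u, v = sub(a, vertex), sub(c, vertex)
    cos_value = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_value))))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees, about the p1-p2 axis."""
    axis = sub(p2, p1)
    norm = math.sqrt(dot(axis, axis))
    axis = tuple(x / norm for x in axis)
    b0, b2 = sub(p0, p1), sub(p3, p2)
    # Remove the components along the axis, then measure the angle between the remainders
    v = sub(b0, tuple(dot(b0, axis) * x for x in axis))
    w = sub(b2, tuple(dot(b2, axis) * x for x in axis))
    return math.degrees(math.atan2(dot(cross(axis, v), w), dot(v, w)))

def two_point_geometry(t1, b1, b2, t2):
    return {
        "distance_b1_b2": math.dist(b1, b2),
        "angle_t1_b1_b2": angle(t1, b1, b2),
        "angle_t2_b2_b1": angle(t2, b2, b1),
        "dihedral_t1_b1_b2_t2": dihedral(t1, b1, b2, t2),
    }

# Hypothetical coordinates in angstroms
geometry = two_point_geometry(
    t1=(0.0, 1.0, 0.0), b1=(0.0, 0.0, 0.0),
    b2=(1.5, 0.0, 0.0), t2=(1.5, 0.0, 1.0),
)
```

The dihedral is undefined when b1 and b2 coincide, so pairs of hydrogens on the same heavy atom would need separate handling.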
Specific examples of fingerprints of the invention are shown in Table 3 below:
Table 3. Examples of fingerprints
Filename_b1_b2    Gid
ligand_5_12       030004005002
ligand_5_13       035004007002
ligand_5_18       033002001006
ligand_6_7        006006005009
ligand_6_8        012007007009
ligand_6_12       036007005002
ligand_6_19       043004007015
For the first type of fingerprint: because the three-dimensional spatial position relationship between the two connection points of the small molecule fragment has been defined, for each attribute atom a one-connection-point fingerprint is calculated with respect to the b1 atom and another with respect to the b2 atom, so that two fingerprints are recorded. The second type of fingerprint, aimed at the similarity of small molecule fragments, comprises, in addition to the three-dimensional spatial position relationship between the two connection points, the second-type fingerprints described above for one connection point, calculated from the b1 atom and from the b2 atom and then merged into one fingerprint.
The following application example further illustrates the invention in detail:
Example 2
First, a database and the corresponding fingerprints are prepared from the ChEMBL database using the first fingerprint and the second fingerprint respectively. The original ligand of protein 5v3x is then used as an example. Although the fingerprints are designed without reference to a protein, the protein is taken into account in practical use.
Referring to FIG. 8, which shows the application to database search, that is, the result of applying the first fingerprint: the molecule in the middle is the original ligand of 5v3x, and the shaded part is the part to be modified. The fingerprint of the shaded part is calculated with the invention and used to search the database for fragments with similar attributes but different structures. Using their three-dimensional coordinates, these fragments are translated and rotated into the protein pocket and spliced with the remaining part of the ligand to generate a large number of new molecules. Six of them are shown as examples; the shaded parts of the six surrounding results are the modified parts.
Referring to FIG. 9, which shows the application to similarity calculation, that is, the result of applying the second fingerprint: the fingerprint of the shaded part of the middle molecule is first calculated with the invention, its similarity to each fingerprint in the fingerprint library is then calculated, and the top-ranked fragments, that is, fragments with similar attributes but different structures, are selected. The results are displayed in the figure in the same way as above; the shaded parts of the six surrounding results are the similar structures. This shows that the invention is effective in the design and modification of drug molecules.
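The ranking step of the similarity search can be sketched in Python. The text does not say which similarity measure is used. The Tanimoto coefficient below is substituted purely as a common choice for 0/1 fingerprints and is an assumption, as are the function names, the short toy fingerprints and the fragment names.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two equal-length 0/1 sequences."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0

def rank_by_similarity(query, library, top_n=3):
    """Return the top_n (name, score) pairs, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]

# Hypothetical 6-position fingerprints; real ones would have 14 x 22 positions
query = [1, 0, 1, 1, 0, 0]
library = {
    "frag_a": [1, 0, 1, 0, 0, 0],  # shares 2 of the 3 positions set in either
    "frag_b": [0, 1, 0, 0, 1, 0],  # shares none
    "frag_c": [1, 0, 1, 1, 0, 0],  # identical to the query
}

ranking = rank_by_similarity(query, library, top_n=2)
```

In this toy library frag_c ranks first with a score of 1.0 and frag_a second with a score of 2/3.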
The above examples show that the two types of fingerprint are effective in the design and modification of drug molecules. Two different fingerprints describing the properties of small molecule fragments are calculated from the three-dimensional structure of the small molecule, and experiments show that both are highly effective in designing and modifying drug molecules, particularly for avoiding existing patents. The first type of fingerprint is essentially an exhaustive method: for each small molecule structure passed in, the attribute atoms, mainly hydrogen bond donors and hydrogen bond acceptors, are marked, the connection points are determined, and all fingerprints are calculated for those connection points so that every case that may occur is covered. The second type of fingerprint aims to describe as many characteristics of the small molecule as possible, including atomic attributes and molecular spatial size. When a new system appears, molecules similar in shape and attributes can be found quickly through similarity calculation, which helps medicinal chemists modify drug molecules quickly and supports directed modification of drug molecules.
While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (3)

1. A fingerprint design method for describing the properties of small molecule fragments based on the three-dimensional structure of a small molecule, characterized in that a three-dimensional small molecule structure is input, the attribute atoms of the small molecule are identified, the spatial position relationship between the connection point atoms and the attribute atoms is calculated, and two different fingerprints describing the properties of small molecule fragments are designed, wherein the specific calculation comprises the following steps:
step 1: according to the atom type definitions of RDKit, finding all target attribute atoms in the input three-dimensional small molecule structure and recording them as the attribute atom set A, an attribute atom having one of two attributes, hydrogen bond donor or hydrogen bond acceptor;
step 2: taking the position of every hydrogen atom on the small molecule as a connection point position, taking the heavy atom directly bonded to the hydrogen atom as the target connection point atom, labelled a b atom, and labelling the corresponding hydrogen atom a t atom; dividing three-dimensional small molecule fragments into two classes, one connection point and multiple connection points, wherein multiple connection points are based on two connection points combined in arbitrary pairs, the two connection point hydrogen atoms are defined as t1 and t2, and the two corresponding heavy atoms are defined as b1 and b2;
step 3: according to the intended use, designing two different types of fingerprint, one for database search and one for similarity calculation search, for the cases of one connection point and multiple connection points as follows:
1) one connection point
the first type of fingerprint calculation includes: defining the li atom as the i-th atom in the attribute atom set A; determining the attribute of the li atom, that is, whether it is a hydrogen bond acceptor or a hydrogen bond donor; calculating the distance between the li atom and the b atom; calculating the angle with the b atom as the vertex in the triangle formed by the li atom, the b atom and the t atom; and finding the heavy atom farthest from the b atom and recording that distance as LD;
the second type of fingerprint calculation includes: calculating the distances between every atom in the attribute atom set A and the b atom and between every atom in A and every other atom in A, and assigning these distances to a twenty-two-dimensional fingerprint;
2) multiple connection points
fingerprints for multiple connection points can be defined on the basis of two connection points; because the two types of fingerprint share a common part in the two-connection-point case, the three-dimensional spatial position relationship between the two connection points and their corresponding hydrogen atoms is defined: calculating the Euclidean distance between the two connection point heavy atoms b1 and b2; calculating the angles formed by the heavy atoms and the hydrogen atoms at the two connection points, namely the angle t1, b1, b2 and the angle t2, b2, b1; and calculating the dihedral angle of the two connection point heavy atoms and the two hydrogen atoms, namely the dihedral angle t1, b1, b2, t2;
3) Centralized processing of the two types of fingerprint
All connection-point distances are multiplied by five and all angles are divided by 20 to centralize the two types of fingerprint, where the distance is calculated according to the following formula (1):
(formula image FDA0002402346350000021)
and the angle is calculated according to the following formula (2):
angle = (AB · BC) / (|AB| · |BC|)    (2)
where AB is the vector between the connection-point atom and the attribute atom, and BC is the vector between the connection-point atom and the hydrogen atom.
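As an illustration only, the one-connection-point quantities of claim 1 (the li-b distance, the angle at the b atom, the farthest-heavy-atom distance LD, and the scaling by 5 and 1/20) might be sketched in Python as follows. The function names, the use of degrees for angles, the reading of formula (1) as an ordinary Euclidean distance, and the choice to take the arccosine of the normalized dot product of formula (2) with both vectors starting at the b atom are assumptions of this sketch, not statements from the claims.

```python
import math

def angle_at_b(a, b, c):
    """Angle in degrees at vertex b of the triangle a-b-c (assumed unit: degrees)."""
    ba = [x - y for x, y in zip(a, b)]  # vector from b to a
    bc = [x - y for x, y in zip(c, b)]  # vector from b to c
    cos_theta = sum(x * y for x, y in zip(ba, bc)) / (math.hypot(*ba) * math.hypot(*bc))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_theta))

def one_point_geometry(li, b, t, heavy_atoms):
    """Raw geometric terms for one attribute atom li and one connection point (b, t)."""
    return {
        "distance": math.dist(li, b),                     # li-b distance
        "angle": angle_at_b(li, b, t),                    # angle li-b-t at vertex b
        "LD": max(math.dist(h, b) for h in heavy_atoms),  # farthest heavy atom from b
    }

def centralize(distance, angle):
    """Scaling described in part 3): distance times 5, angle divided by 20."""
    return distance * 5.0, angle / 20.0

# Hypothetical coordinates, chosen only to make the geometry easy to check by hand.
b, t, li = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)
geom = one_point_geometry(li, b, t, heavy_atoms=[b, li, (3.0, 4.0, 0.0)])
scaled = centralize(geom["distance"], geom["angle"])
```

Whether LD is also multiplied by five is not clear from the claim wording, so the sketch leaves it unscaled.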
2. The fingerprint design method for describing properties of small molecule fragments based on the three-dimensional structure of small molecules according to claim 1, wherein the binning into twenty-two dimensions in step 3 is performed as follows: for distances greater than 1 and less than or equal to 6, every 0.5 is one dimension; for distances greater than 6 and less than or equal to 10.2, every 0.6 is one dimension; and for distances greater than 10.2 and less than or equal to 15.2, every 1.0 is one dimension.
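The three ranges in claim 2 give 10 + 7 + 5 = 22 bins. A minimal Python sketch of that binning is shown below. Storing a count per bin, and silently dropping distances outside the range (1, 15.2], are assumptions of this sketch; the claim does not say whether each dimension holds a count or a bit, nor how out-of-range distances are handled. The names `EDGES`, `bin_index` and `distance_fingerprint` are hypothetical.

```python
from bisect import bisect_left

# Upper edge of each bin, built from integers to avoid floating-point drift.
EDGES = (
    [x / 10 for x in range(15, 61, 5)]       # (1, 6]: width 0.5, 10 bins
    + [x / 10 for x in range(66, 103, 6)]    # (6, 10.2]: width 0.6, 7 bins
    + [x / 10 for x in range(112, 153, 10)]  # (10.2, 15.2]: width 1.0, 5 bins
)

def bin_index(d):
    """Index (0..21) of the bin containing distance d, or None if out of range."""
    if d <= 1.0 or d > EDGES[-1]:
        return None
    # First bin whose upper edge is >= d, so each bin is open below and closed above.
    return bisect_left(EDGES, d)

def distance_fingerprint(distances):
    """Count how many distances fall in each of the 22 bins (assumed encoding)."""
    fp = [0] * len(EDGES)
    for d in distances:
        i = bin_index(d)
        if i is not None:
            fp[i] += 1
    return fp

# Hypothetical distances; 20.0 lies outside the covered range and is ignored.
fp = distance_fingerprint([1.2, 1.4, 6.1, 20.0])
```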
3. The fingerprint design method for describing properties of small molecule fragments based on the three-dimensional structure of small molecules according to claim 1, wherein, in the design of the two types of fingerprint for two connection points in step 3: for the first type of fingerprint, when there is an attribute atom, that attribute atom is calculated against the b1 atom and against the b2 atom separately, each as a one-connection-point fingerprint, so that two fingerprints are calculated and recorded; and the second type of fingerprint, which is based on the similarity of small molecule fragments, comprises, in addition to the three-dimensional spatial relationship of the two connection points, a second-type fingerprint calculated as for one connection point from the b1 atom and from the b2 atom separately, the two then being merged into a single fingerprint.
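The three-dimensional spatial relationship of two connection points that claims 1 and 3 rely on (the b1-b2 distance, the two angles, and the dihedral angle t1-b1-b2-t2), together with the pairwise combination of connection points from step 2, can be sketched in Python as below. All function names and coordinates are hypothetical, angles are assumed to be in degrees, and reading the second angle as t2-b2-b1 is an assumption based on the symmetry of the claim text.

```python
import math
from itertools import combinations

def _sub(p, q):
    return tuple(x - y for x, y in zip(p, q))

def _dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def angle_deg(a, b, c):
    """Angle in degrees at vertex b of the triangle a-b-c."""
    u, v = _sub(a, b), _sub(c, b)
    cos_theta = _dot(u, v) / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for the chain p0-p1-p2-p3 (atan2 form)."""
    v1, v2, v3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(v1, v2), _cross(v2, v3)
    return math.degrees(math.atan2(math.hypot(*v2) * _dot(v1, n2), _dot(n1, n2)))

def two_point_geometry(b1, t1, b2, t2):
    """Distance, two angles and dihedral for one pair of connection points."""
    return {
        "distance": math.dist(b1, b2),
        "angle1": angle_deg(t1, b1, b2),         # angle t1-b1-b2
        "angle2": angle_deg(t2, b2, b1),         # angle t2-b2-b1 (assumed reading)
        "dihedral": dihedral_deg(t1, b1, b2, t2),
    }

def pairwise_geometries(connection_points):
    """connection_points is a list of (b, t) coordinate pairs; every pair is combined."""
    return [two_point_geometry(b1, t1, b2, t2)
            for (b1, t1), (b2, t2) in combinations(connection_points, 2)]

# Hypothetical connection points: heavy atom b and its hydrogen t.
points = [((0.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
          ((1.5, 0.0, 0.0), (1.5, 1.0, 0.0)),
          ((3.0, 0.0, 0.0), (3.0, -1.0, 0.0))]
geoms = pairwise_geometries(points)
```

The sign convention of the dihedral angle is a design choice of this sketch; the claims do not state whether a signed or unsigned value is stored.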
CN202010150737.7A 2020-03-06 2020-03-06 Fingerprint design method for describing properties of small molecule fragments based on small molecule three-dimensional structure Active CN111402966B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010150737.7A CN111402966B (en) 2020-03-06 2020-03-06 Fingerprint design method for describing properties of small molecule fragments based on small molecule three-dimensional structure

Publications (2)

Publication Number Publication Date
CN111402966A CN111402966A (en) 2020-07-10
CN111402966B CN111402966B (en) 2022-08-19

Family

ID=71413220

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010150737.7A Active CN111402966B (en) 2020-03-06 2020-03-06 Fingerprint design method for describing properties of small molecule fragments based on small molecule three-dimensional structure

Country Status (1)

Country Link
CN (1) CN111402966B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1725222A (en) * 2004-07-23 2006-01-25 中国科学院上海药物研究所 Combinatorial chemistry centralized repository design and optimization method
CN102930169A (en) * 2012-11-07 2013-02-13 景德镇陶瓷学院 Method for predicating drug-target combination based on grey theory and molecular fingerprints
CN106777986A (en) * 2016-12-19 2017-05-31 南京邮电大学 Ligand molecular fingerprint generation method based on depth Hash in drug screening
CN107526939A (en) * 2017-06-30 2017-12-29 南京理工大学 A kind of quick small molecule structure alignment schemes
CN108205613A (en) * 2017-12-11 2018-06-26 华南理工大学 The computational methods of similarity and system and their application between a kind of compound molecule
CN109658989A (en) * 2018-11-14 2019-04-19 国网新疆电力有限公司信息通信公司 Class drug compound toxicity prediction method based on deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MolOpt:A Web Server for Drug Design using Bioisosteric Transformation;Jinwen Shan,Changge Ji;《Current Computer-Aided Drug Design》;20190731;全文 *
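Application of molecular similarity methods based on 2D molecular fingerprints in virtual screening;唐玉焕,林克江,尤启东;《Journal of China Pharmaceutical University》;20090602;full text *
Research on a chemical structure similarity retrieval system based on molecular fingerprints;彭涛,孙连英,刘海波,周家驹;《Computers and Applied Chemistry》;20121029;full text *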

Also Published As

Publication number Publication date
CN111402966A (en) 2020-07-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant