CN114171131B - Organic molecule ring isomerism processing method and identification method, and method and device for obtaining organic molecule sample conformation

Publication number
CN114171131B
Authority
CN
China
Prior art keywords
ring
organic molecule
substituent
preset
flexible
Legal status
Active
Application number
CN202111468255.7A
Other languages
Chinese (zh)
Other versions
CN114171131A (en)
Inventor
周云飞
孙广旭
Current Assignee
Shanghai Zhiyao Technology Co ltd
Original Assignee
Shanghai Zhiyao Technology Co ltd
Application filed by Shanghai Zhiyao Technology Co ltd
Priority to CN202111468255.7A
Publication of CN114171131A
Application granted
Publication of CN114171131B

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/20: Identification of molecular entities, parts thereof or of chemical compositions
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E: REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E10/00: Energy generation through renewable energy sources
    • Y02E10/50: Photovoltaic [PV] energy
    • Y02E10/549: Organic PV cells

Landscapes

  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Chemical & Material Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Organic Low-Molecular-Weight Compounds And Preparation Thereof (AREA)

Abstract

The application relates to a processing method and an identification method for organic molecule ring isomerism, and to a method and a device for obtaining organic molecule sample conformation. The processing method comprises the following steps: obtaining a sample conformation of a target organic molecule after determining that the initial molecular structure of the target organic molecule comprises a flexible ring; determining the ring shape of each flexible ring according to the ring size of the flexible ring and the relative spatial positions of the atoms in the ring; when any flexible ring in the sample conformation contains a substituent, determining the pose of every substituent on each flexible ring according to the relative spatial positions of the substituent and the flexible ring on which it is located; and obtaining ring isomerism information of the sample conformation of the target organic molecule from the ring shape of each flexible ring containing a substituent and the poses of all the substituents. The scheme provided by the application can collect and identify the ring isomerism information of the flexible rings in the target organic molecule, reducing time cost and labor cost.

Description

Organic molecule ring isomerism processing method and identification method, and method and device for obtaining organic molecule sample conformation
Technical Field
The application relates to the technical field of ring isomerism, and in particular to a processing method and an identification method for organic molecule ring isomerism, and to a method and a device for obtaining organic molecule sample conformation.
Background
In organic molecules, the many distinct spatial arrangements of atoms or groups produced by rotation about C-C single bonds are referred to as conformations. A flexible ring in an organic molecule can interconvert between different ring conformations, which is referred to as ring isomerism.
Ring isomerism of flexible rings is important in research on organic reactions, drug mechanisms, and the like. Taking organic drug molecules as an example, different ring isomers can produce specific pharmacological responses, so obtaining the ring isomerism information of a flexible ring is critical. However, for large flexible rings or rings in a complex chemical environment, the ring conformations cannot be obtained by simple rotation about C-C single bonds, and obtaining them by manual modeling places high demands on the user's background knowledge and experience and is time-consuming. How to devise an efficient strategy for obtaining ring isomerism information in organic molecules is therefore a problem to be solved.
Disclosure of Invention
In order to solve or partially solve the problems in the related art, the application provides a processing method and an identification method for organic molecule ring isomerism, and a method and a device for obtaining organic molecule sample conformation, which can collect and identify ring isomerism information of a flexible ring in an organic molecule, and reduce time cost and labor cost.
In a first aspect, the present application provides a method for processing organic molecule ring isomerism, which comprises:
obtaining a sample conformation of a target organic molecule after determining that an initial molecular structure of the target organic molecule comprises a flexible ring;
determining the ring shape of each flexible ring according to the ring size of each flexible ring and the relative spatial position of each atom in each flexible ring; when any flexible ring in the sample conformation comprises a substituent, determining the substituent pose of all the substituents on each flexible ring according to the relative spatial positions of the substituent and the flexible ring on which the substituent is located;
and obtaining ring isomerism information of the sample conformation of the target organic molecule according to the ring shape of each flexible ring containing a substituent and the substituent poses of all the substituents.
In one embodiment, before obtaining the sample conformation of the target organic molecule, the method further comprises:
determining whether the initial molecular structure comprises a flexible ring based on the atomic connectivity information and the bond order information of the initial molecular structure.
In one embodiment, after determining that the initial molecular structure of the target organic molecule comprises a flexible ring, the method further comprises:
and acquiring ring information of the flexible ring, wherein the ring information comprises ring type, existence or nonexistence of a substituent, substituent type and position information of the substituent.
In one embodiment, said obtaining a sample conformation of said target organic molecule comprises:
obtaining, according to a preset periodic annealing condition and at a preset time interval, the molecular structures of the molecular system of the target organic molecule in different states along a trajectory simulated in a preset force field, thereby obtaining a preset number of sample conformations.
In one embodiment, the determining the ring shape of each flexible ring based on the ring size of each flexible ring and the relative spatial position of each atom within each flexible ring comprises:
respectively acquiring space coordinates corresponding to atoms in the flexible ring;
determining the ring shape of the flexible ring based on the spatial coordinates of each of the atoms.
In one embodiment, said determining the ring shape of said flexible ring from the spatial coordinates of each of said atoms comprises:
selecting, in the flexible ring, any four atoms that are not connected end to end, and defining the corresponding dihedral angle for each selection;
determining the value of each dihedral angle according to the spatial coordinates of the atoms;
and determining the ring shape of the flexible ring according to the values of the dihedral angles.
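As an illustration of the dihedral-angle step, the following is a minimal sketch in plain Python. The function names `dihedral_deg` and `ring_dihedrals` are hypothetical and do not come from the application; the sign convention (angle from `atan2`) is a common one and is assumed here, since the application does not specify one.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four sequentially bonded atoms."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def ring_dihedrals(ring_coords):
    """Dihedral over atoms i, i+1, i+2, i+3 (indices modulo ring size) for each i."""
    n = len(ring_coords)
    return [dihedral_deg(*(ring_coords[(i + k) % n] for k in range(4)))
            for i in range(n)]
```

A dihedral angle close to zero indicates that the four atoms involved are nearly coplanar, which is the kind of information a reference plane can be derived from.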
In one embodiment, the determining the ring shape of the flexible ring according to the value of each dihedral angle includes:
determining a reference plane in the flexible ring according to the dihedral angle values;
and determining the ring shape of the flexible ring according to the position, relative to the reference plane, of the atoms lying outside the reference plane.
In one embodiment, said determining the substituent pose of all substituents on each flexible ring based on the relative spatial positions of said substituents and the flexible ring on which said substituents are located comprises:
determining the pose of the substituent according to the relative position relation between the substituent and the reference plane.
In one embodiment, after obtaining the ring isomerization information of the target organic molecule, the method further comprises:
and optimizing the ring isomerism information through quantum chemical calculation to obtain a stable isomer corresponding to the target organic molecule.
In one embodiment, the method further comprises:
when the flexible ring does not contain a substituent, ring isomerism information of the target organic molecule is obtained according to the ring shape.
A second aspect of the present application provides a method for obtaining sample conformations of an organic molecule, comprising:
simulating the trajectory of the target organic molecule in a preset force field according to a preset periodic annealing condition;
and obtaining, at a preset time interval, the molecular structures of the molecular system of the target organic molecule in different states along the trajectory, thereby obtaining a preset number of sample conformations.
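The application does not state the form of the periodic annealing condition at this point. Purely as an illustration, the sketch below assumes a repeating heat, hold, cool cycle and evenly spaced sampling times; the function names and every numeric default (temperatures in K, durations in ps) are hypothetical placeholders, not values from the application.

```python
def annealing_temperature(t_ps, t_low=300.0, t_high=1000.0,
                          heat_ps=10.0, hold_ps=10.0, cool_ps=30.0):
    """Temperature at time t_ps for an assumed repeating heat/hold/cool cycle."""
    period = heat_ps + hold_ps + cool_ps
    phase = t_ps % period
    if phase < heat_ps:  # linear heating ramp
        return t_low + (t_high - t_low) * phase / heat_ps
    if phase < heat_ps + hold_ps:  # plateau at the high temperature
        return t_high
    # linear cooling ramp back to the low temperature
    return t_high - (t_high - t_low) * (phase - heat_ps - hold_ps) / cool_ps

def sampling_times(interval_ps, n_samples):
    """Times at which a structure is taken from the trajectory."""
    return [interval_ps * (i + 1) for i in range(n_samples)]
```

In such a scheme the preset time interval and the preset number of samples together fix how long the simulated trajectory has to be.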
In a third aspect of the present application, there is provided a method for identifying organic molecule ring isomerism, comprising:
obtaining ring information of a flexible ring in a sample conformation of a target organic molecule, wherein the ring information comprises a ring type, a spatial coordinate of an atom in the ring and whether a substituent exists;
determining the ring shape of each flexible ring according to the ring size of each flexible ring and the relative spatial position of each atom in each flexible ring; when any flexible ring in the conformation contains a substituent, determining the substituent pose of all substituents on each flexible ring according to the relative spatial positions of the substituent and the flexible ring on which the substituent is located.
The fourth aspect of the present application provides a processing apparatus for organic molecule ring isomerization, which includes:
a conformation acquisition module for acquiring a sample conformation of a target organic molecule after determining that an initial molecular structure of the target organic molecule comprises a flexible ring;
the conformation recognition module is used for determining the ring shape of each flexible ring according to the ring size of each flexible ring and the relative spatial position of each atom in each flexible ring; when any flexible ring in the sample conformation comprises a substituent, determining the substituent poses of all the substituents on each flexible ring according to the relative spatial positions of the substituent and the flexible ring on which the substituent is located;
and the information generation module is used for obtaining the ring isomerism information of the sample conformation of the target organic molecule according to the ring shape of each flexible ring containing a substituent and the substituent poses of all the substituents.
In an embodiment, the conformation acquisition module is configured to obtain, according to a preset periodic annealing condition and at a preset time interval, the molecular structures of the molecular system of the target organic molecule in different states along a trajectory simulated in a preset force field, so as to obtain a preset number of sample conformations.
A fifth aspect of the present application provides an apparatus for obtaining a conformation of an organic molecule sample, comprising:
the simulation module is used for simulating the trajectory of the target organic molecule in a preset force field according to a preset periodic annealing condition;
and the acquisition module is used for obtaining, at a preset time interval, the molecular structures of the molecular system of the target organic molecule in different states along the trajectory, thereby obtaining a preset number of sample conformations.
A sixth aspect of the present application provides an apparatus for identifying organic molecule ring isomerism, comprising:
the ring information acquisition module is used for acquiring ring information of a flexible ring in a sample conformation of the target organic molecule, wherein the ring information comprises a ring type, a spatial coordinate of an atom in the ring and whether a substituent exists;
the conformation recognition module is used for determining the ring shape of each flexible ring according to the ring size of the flexible ring and the relative spatial position of each atom in the flexible ring; when any flexible ring in the conformation contains a substituent, determining the substituent pose of all substituents on each flexible ring according to the relative spatial positions of the substituent and the flexible ring on which the substituent is located.
A seventh aspect of the present application provides an electronic device, comprising:
a processor; and
a memory having executable code stored thereon, which when executed by the processor, causes the processor to perform the method as described above.
An eighth aspect of the present application provides a computer-readable storage medium having stored thereon executable code, which, when executed by a processor of an electronic device, causes the processor to perform the method as described above.
The technical scheme provided by the application can comprise the following beneficial effects:
According to the technical scheme, a sample conformation of the target organic molecule is obtained, the ring shape of each flexible ring is identified according to the ring size of the flexible ring and the relative spatial positions of the atoms in it, and the pose of each substituent is identified according to its spatial position relative to the flexible ring on which it is located, so that the ring isomerism information of the target organic molecule can be obtained. With this design, the sample conformation of the target organic molecule and the corresponding ring isomerism information can be obtained efficiently without manual modeling, which reduces time cost and labor cost and improves operating efficiency.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The foregoing and other objects, features and advantages of the application will be apparent from the following more particular descriptions of exemplary embodiments of the application as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout the exemplary embodiments of the application.
FIG. 1 is a schematic flow chart of a method for processing organic molecule ring isomerism according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the ring isomerism types of a five-membered ring;
FIG. 3 is another schematic flow chart of a method for processing organic molecule ring isomerism according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the ring isomerism types of a six-membered ring;
FIG. 5 is a schematic flow chart of a method for obtaining sample conformations of an organic molecule according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of a method for identifying organic molecule ring isomerism according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an apparatus for processing organic molecule ring isomerism according to an embodiment of the present application;
FIG. 8 is another schematic structural diagram of an apparatus for processing organic molecule ring isomerism according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of an apparatus for obtaining sample conformations of an organic molecule according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an apparatus for identifying organic molecule ring isomerism according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While embodiments of the present application are illustrated in the accompanying drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms "first," "second," "third," etc. may be used herein to describe various information, these information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
In the related art, when a flexible ring is present in an organic molecule, its ring conformations are obtained by manual modeling, which places high demands on the user's background knowledge and experience and incurs high time cost and labor cost.
In view of the above problems, embodiments of the present application provide a processing method for ring isomerism of an organic molecule, which can collect and identify ring isomerism information of a flexible ring in an organic molecule, reduce time cost and labor cost, and improve work efficiency.
The technical solutions of the embodiments of the present application are described in detail below with reference to the accompanying drawings.
FIG. 1 is a schematic flow chart of a method for processing organic molecule ring isomerism according to an embodiment of the present application.
Referring to FIG. 1, a method for processing organic molecule ring isomerism in an embodiment of the present application includes:
Step S110, after determining that the initial molecular structure of the target organic molecule includes a flexible ring, obtaining a sample conformation of the target organic molecule.
In one embodiment, whether the initial molecular structure comprises a flexible ring is determined based on the atomic connectivity information and bond order information of the initial molecular structure. It is understood that the connectivity and bond order information of each atom in the target organic molecule can be obtained from the initial molecular structure of the target organic molecule, for example its 3D structure. In one embodiment, the 3D structure of the target organic molecule can be generated from a SMILES string or from another structure file, such as an xyz, pdb, mol2, or mol file, using an open-source cheminformatics toolkit such as RDKit. The cheminformatics toolkit can also provide information about the target organic molecule, for example: molecular formula, molecular weight, number of atoms, number of flexible bonds, charge number, number of hydrogen bond acceptors, number of hydrogen bond donors, number of chiral centers, number of N atoms, and the like. In a specific embodiment, whether the initial molecular structure contains a ring structure is first judged from the molecular topology of the target organic molecule, namely the connectivity of its atoms. Next, when the initial molecular structure contains a ring structure, whether that ring structure is a flexible ring is determined from the bond order information of the chemical bonds between the atoms in the ring (that is, whether each bond is a single, double, or triple bond, and which atom types are bonded); for example, it is further determined whether the ring structure contains an aromatic ring. A flexible ring is a ring structure in which the relative spatial positions of the ring atoms can change without any change in bonding, producing a new stereoisomer.
It is understood that if the ring structures of the target organic molecule are all aromatic rings, it is determined that (the initial molecular structure of) the target organic molecule does not include a flexible ring therein.
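The two-stage check described above (find rings from connectivity, then decide flexibility from bond orders) could be sketched as follows. This is a simplified illustration, not the application's implementation: the names `find_rings` and `flexible_rings`, the `(i, j, order)` bond tuples, and the encoding of aromatic bonds as order 1.5 are all assumptions, and a ring is treated as non-flexible only when every bond in it is aromatic.

```python
from collections import deque

def find_rings(n_atoms, bonds):
    """Smallest ring through each ring bond, as a set of atom-index frozensets."""
    adj = {i: set() for i in range(n_atoms)}
    for i, j, _order in bonds:
        adj[i].add(j)
        adj[j].add(i)
    rings = set()
    for i, j, _order in bonds:
        # Breadth-first search from i to j that may not use the bond i-j itself.
        prev = {i: None}
        queue = deque([i])
        while queue and j not in prev:
            a = queue.popleft()
            for b in adj[a]:
                if b in prev or (a == i and b == j):
                    continue
                prev[b] = a
                queue.append(b)
        if j in prev:  # an alternative path exists, so bond i-j lies in a ring
            path, a = [], j
            while a is not None:
                path.append(a)
                a = prev[a]
            rings.add(frozenset(path))
    return rings

def flexible_rings(n_atoms, bonds, aromatic_order=1.5):
    """Rings whose bonds are not all aromatic (simplifying assumption)."""
    flexible = []
    for ring in find_rings(n_atoms, bonds):
        orders = [o for i, j, o in bonds if i in ring and j in ring]
        if not all(o == aromatic_order for o in orders):
            flexible.append(ring)
    return flexible
```

For example, a six-membered ring of single bonds is reported as flexible, while the same ring with all bond orders set to 1.5 is not.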
When the initial molecular structure of the target organic molecule is determined to contain a flexible ring, a certain number of sample conformations are obtained from the many conformations of the target organic molecule to serve as an image set, so that each sample conformation in the image set can be identified in a subsequent step. The preset number of sample conformations may be obtained by a random search method, a systematic search method, or other methods. In one embodiment, the preset number may be adjusted according to the structural complexity of the target organic molecule.
Step S120, determining the ring shape of each flexible ring according to the ring size of each flexible ring and the relative spatial position of each atom in each flexible ring; when any flexible ring in the sample conformation contains a substituent, determining the substituent pose of all substituents on each flexible ring according to the relative spatial positions of the substituents and the flexible ring on which the substituents are located.
It is understood that each flexible ring may or may not contain substituents, depending on the target organic molecule. From the connectivity information of the atoms in the initial molecular structure of the target organic molecule, it can be determined whether each flexible ring contains a substituent and where each substituent is located. Specifically, it can be determined whether the atoms outside the ring that are attached to the carbon atoms in the ring are hydrogen atoms: if all of them are hydrogen atoms, the flexible ring does not contain a substituent; otherwise it does. It is understood that the position information of a substituent is determined by which ring atom the substituent is connected to, and changes accordingly when the substituent is connected to a different ring atom.
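The hydrogen test described above can be sketched in a few lines. The function name `ring_substituents` and the input layout (a set of ring atom indices, an adjacency mapping, and a list of element symbols) are hypothetical choices made for this illustration.

```python
def ring_substituents(ring, adjacency, elements):
    """Map each ring atom to its non-hydrogen neighbours outside the ring.

    An empty result means the ring carries no substituent.
    """
    found = {}
    for atom in sorted(ring):
        outside = [nb for nb in sorted(adjacency[atom])
                   if nb not in ring and elements[nb] != "H"]
        if outside:
            found[atom] = outside  # key = position of the substituent(s)
    return found
```

The keys of the returned mapping play the role of the substituent position information, since they name the ring atom each substituent is attached to.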
Further, in an embodiment, the spatial coordinates of the atoms in the flexible ring may be obtained respectively, and the ring shape and the substituent pose of the flexible ring determined from the spatial coordinates of each atom. It will be appreciated that the target organic molecule has corresponding molecular spatial coordinates, which consist of the three-dimensional coordinates of each atom of the target organic molecule. By obtaining the spatial coordinates of each atom in the flexible ring, the relative spatial positions of the atoms can be determined. The ring shape of the flexible ring can then be identified from the spatial coordinates of the ring atoms using established stereochemical theory. In addition, depending on the ring size, the flexible ring may be a four-membered, five-membered, six-membered, seven-membered, or eight-membered ring, etc. Each ring size has its own corresponding ring shapes. For ease of understanding, take a five-membered flexible ring as an example: as shown in FIG. 2, in the related art a five-membered ring generally has three types of ring shape, namely a planar structure, an "envelope type", and a "cross type". The relative spatial positions of the ring atoms differ from one type of ring shape to another. Thus, the ring shape of the flexible ring can be inferred from the relative spatial positions of the atoms in the ring.
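To make the five-membered-ring example concrete, the sketch below classifies a ring from its five ring dihedral angles by counting how many are close to zero. The function name, the 5-degree tolerance, and the counting rule are assumptions made for illustration. The shape the text calls "cross type" is taken here to be the twist form, which is also an assumption.

```python
def classify_five_membered_ring(torsions_deg, tol_deg=5.0):
    """Classify a five-membered ring from its ring dihedral angles.

    torsions_deg[i] is the dihedral over ring atoms i, i+1, i+2, i+3
    (indices modulo 5). Returns (shape, flap_atom_index_or_None).
    """
    if len(torsions_deg) != 5:
        raise ValueError("expected five ring dihedral angles")
    flat = [i for i, t in enumerate(torsions_deg) if abs(t) < tol_deg]
    if len(flat) == 5:
        return "planar", None
    if len(flat) == 1:
        # Atoms i..i+3 are coplanar and define the reference plane;
        # the remaining ring atom is the out-of-plane flap.
        return "envelope", (flat[0] + 4) % 5
    if not flat:
        # No four consecutive atoms are coplanar ("cross type", read as twist).
        return "twist", None
    return "unclassified", None
```

For instance, the angle pattern `[0, 25, -40, 40, -25]` has exactly one near-zero dihedral, so it is reported as an envelope whose flap is atom 4.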
Further, the substituent pose is determined from the relative spatial position between the atoms of the substituent and the reference plane of the flexible ring. For flexible rings of different sizes, depending on the spatial coordinates of the substituent atoms, a substituent may be parallel or perpendicular to the reference plane, or may lie on the same side of the reference plane as the ring atoms outside the reference plane or on the opposite side, giving rise to a variety of substituent poses. It will be appreciated that a change in the pose of a substituent directly changes the ring isomerism information.
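One way to express the same-side or opposite-side test is through signed distances to the reference plane, as in the sketch below. The function names, the use of three ring atoms to define the plane, and the 0.1 tolerance (in the coordinate units, assumed to be angstroms) are illustrative assumptions.

```python
import math

def plane_from_points(p0, p1, p2):
    """Unit normal and an origin point for the plane through three atoms."""
    u = [p1[k] - p0[k] for k in range(3)]
    v = [p2[k] - p0[k] for k in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = math.sqrt(sum(c * c for c in n))
    if norm == 0.0:
        raise ValueError("reference atoms are collinear")
    return [c / norm for c in n], p0

def signed_distance(point, normal, origin):
    """Distance from the plane; the sign tells which side the point is on."""
    return sum(normal[k] * (point[k] - origin[k]) for k in range(3))

def substituent_side(sub_xyz, flap_xyz, plane_atoms, tol=0.1):
    """Side of the substituent relative to the out-of-plane ring atom."""
    normal, origin = plane_from_points(*plane_atoms)
    d_sub = signed_distance(sub_xyz, normal, origin)
    d_flap = signed_distance(flap_xyz, normal, origin)
    if abs(d_sub) < tol:
        return "in-plane"
    return "same side" if d_sub * d_flap > 0 else "opposite side"
```

The sketch only distinguishes sides of the plane; telling a parallel substituent from a perpendicular one would additionally need the angle between the substituent bond and the plane normal.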
Step S130, obtaining ring isomerism information of the target organic molecule according to the ring shape of each flexible ring containing a substituent and the substituent poses of all the substituents.
After each flexible ring in each sample conformation of the target organic molecule has been identified and its ring shape and corresponding substituent poses determined, the ring isomerism information of the target organic molecule can be described by combining the ring shape of each flexible ring with all the substituent poses on that ring. It is understood that the ring isomerism information of the target organic molecule differs according to the ring shape of each flexible ring and/or the pose of each substituent. Once obtained, the ring isomerism information can be put to use; for example, when screening conformations of organic drug molecules, it facilitates selecting the ring isomer that corresponds to a specific pharmacological response.
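Combining ring shapes and substituent poses into a single descriptor could look like the sketch below, which also counts how many sample conformations share each descriptor. The names `ring_isomerism_key` and `count_ring_isomers`, and the `(ring_id, shape, {position: pose})` layout, are hypothetical.

```python
from collections import Counter

def ring_isomerism_key(rings):
    """Hashable descriptor for one sample conformation.

    rings: list of (ring_id, ring_shape, {substituent_position: pose}).
    """
    return tuple(
        (ring_id, shape, tuple(sorted(poses.items())))
        for ring_id, shape, poses in sorted(rings, key=lambda r: r[0])
    )

def count_ring_isomers(conformations):
    """Number of sample conformations observed for each distinct descriptor."""
    return Counter(ring_isomerism_key(c) for c in conformations)
```

Two conformations receive the same key only when every flexible ring has the same shape and every substituent the same pose, which matches the statement that a change in either one changes the ring isomerism information.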
In other embodiments, when the flexible ring does not contain a substituent, then ring isomerism information of the target organic molecule is obtained according to the ring shape.
As can be seen from the above embodiments, according to the technical solution of the present application, a sample conformation of the target organic molecule is obtained, the ring shape of each flexible ring is identified according to the ring size of the flexible ring and the relative spatial positions of the atoms in it, and the pose of each substituent is identified according to its spatial position relative to the flexible ring on which it is located, so that the ring isomerism information of the target organic molecule can be obtained. With this design, the sample conformation of the target organic molecule and the corresponding ring isomerism information can be obtained efficiently without manual modeling, which reduces time cost and labor cost and improves operating efficiency.
FIG. 3 is another schematic flow chart of a method for processing organic molecule ring isomerism according to an embodiment of the present application. The algorithm of the processing method of this embodiment may be written in the Python 2.7 programming language and used in a Jupyter or Linux environment. A SMILES (Simplified Molecular Input Line Entry System) string or a structure file of the target organic molecule, such as an xyz, pdb, mol2, or mol file, is supplied as input, and sample conformations are then acquired and ring isomerism information identified automatically according to the following method. It should be noted that the method of the present application can be applied to determine the ring isomerism information of four-membered, five-membered, six-membered, seven-membered, and eight-membered rings.
Referring to FIG. 3, the organic molecule ring isomerism processing method of the present application includes:
Step S210, generating a corresponding initial molecular structure according to the molecular information of the target organic molecule; determining whether the initial molecular structure contains a flexible ring based on the atomic connectivity information and the bond order information of the initial molecular structure.
The molecular information of the target organic molecule is obtained using the open-source cheminformatics toolkit RDKit, and may include the molecular formula, molecular weight, SMILES string, number of atoms, number of flexible bonds, charge number, number of hydrogen bond acceptors, number of hydrogen bond donors, number of flexible rings, number of chiral centers, and the like. The molecular information may be stored in tabular form as a molecular information table. The table can be displayed in Jupyter (an interactive notebook), so that the user can inspect it in real time. Further, the SMILES string may be converted into a three-dimensional molecular structure, thereby obtaining the initial molecular structure.
Further, the atomic connectivity information and bond order information can be determined from the initial molecular structure, and from them whether the target organic molecule contains a flexible ring. The specific determination process has already been described in step S110 and is not repeated here. If the target organic molecule contains a flexible ring, the subsequent steps are performed; if it does not, the process ends and the subsequent steps are not performed.
Step S220, after determining that the initial molecular structure of the target organic molecule includes a flexible ring, obtaining ring information of the flexible ring, where the ring information includes a ring type, a ring atom number, whether a substituent exists, a substituent type, and position information of the substituent.
It is to be understood that the ring information serves as the basis for determining the ring shape and the substituent attitude in subsequent steps. The ring type includes the ring size, the number of rings, and the ring connection type. The ring size is determined by the number of carbon atoms; for example, depending on the ring size, the flexible ring may be a four-membered, five-membered, or six-membered ring, and so on. In terms of the number of rings, the structure may be monocyclic, bicyclic, or polycyclic. The ring connection type includes spiro rings and fused rings: in a spiro ring the rings share one carbon atom, and in a fused ring the rings share two or more carbon atoms (one chemical bond). Further, whether a substituent exists is determined by checking whether each non-ring atom attached to a ring carbon atom is a hydrogen atom: if all such atoms are hydrogen, the ring has no substituent; otherwise, a substituent is present. When a substituent is present, the position information of the substituent identifies the ring atom to which the substituent is attached. The substituent type is the element species of the substituent attached to the ring atom. When there is more than one substituent, the substituents may be attached to the same ring atom or to different ring atoms, and each substituent has its own position information.
Further, the ring information may further include a sequence number of each atom in the ring, and the sequence number of each atom is mapped to the spatial coordinate of the atom, so that the atom sequence number may be used as an index for acquiring the spatial coordinate of the atom, thereby improving the calculation efficiency in the subsequent steps.
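As an illustrative sketch only, the ring information described in step S220 and the serial-number index into the atomic coordinates could be held in a small record such as the following. It is written for Python 3 rather than the Python 2.7 mentioned above, and the class names `RingInfo` and `Substituent`, their fields, and the `coords_by_id` dictionary are hypothetical names chosen for this example, not part of the described implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


@dataclass
class Substituent:
    element: str       # substituent type: element of the atom attached to the ring
    atom_id: int       # serial number of the substituent atom
    ring_atom_id: int  # position information: serial number of the ring atom it is bonded to


@dataclass
class RingInfo:
    atom_ids: List[int]             # serial numbers of the ring atoms, in ring order
    connection: str = "monocyclic"  # e.g. "monocyclic", "spiro", "fused"
    substituents: List[Substituent] = field(default_factory=list)

    @property
    def ring_size(self) -> int:
        # Ring size is the number of atoms that make up the ring.
        return len(self.atom_ids)

    @property
    def has_substituent(self) -> bool:
        return bool(self.substituents)

    def coordinates(self, coords_by_id: Dict[int, Coord]) -> List[Coord]:
        # The serial numbers act as an index into the coordinate table,
        # so no geometric search is needed to find the ring atoms.
        return [coords_by_id[i] for i in self.atom_ids]
```

Keeping the ring atoms in ring order means later steps can address neighbouring atoms by list position alone.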
Step S230, according to the preset periodic annealing condition and the preset time interval, obtaining the molecular structures of the molecular system of the target organic molecule in different states in the motion trajectory simulated by the preset force field, thereby obtaining a preset number of sample conformations.
Here, a random search algorithm is used, and the movement of the molecular system is simulated by the molecular dynamics (MD) method, so that sample conformations are extracted from the ensemble formed by the different states of the molecular system. The initial molecular structure can be used as the starting configuration of the simulation, and a preset number of molecular structures are extracted from the motion trajectory at a preset time interval, giving a preset number of sample conformations. Further, in order to cross the energy barriers between different conformations and obtain the molecular structures of the target organic molecule in different states as comprehensively as possible, the motion of the molecular system is simulated and controlled under the preset periodic annealing condition, which generates the motion trajectory of the corresponding atoms. The preset periodic annealing condition may be that, within each preset time period, the temperature is raised to a first preset temperature at every preset interval duration and then lowered to a second preset temperature, and that a plurality of such preset time periods are executed cyclically within a preset simulation duration. That is, by adjusting the temperature variation range and the preset simulation duration, the single molecule continues its simulated motion in the preset force field and crosses the energy barriers between different conformations, so that a more comprehensive atomic motion trajectory is generated; the instantaneous trajectory at each moment has a corresponding molecular structure. The preset force field may be an AMBER force field, such as the GAFF2 force field, or another force field, which is not limited herein. It is understood that different molecular structures correspond to different conformations.
Furthermore, each frame of the atomic motion trajectory of the target organic molecule simulated in the preset force field has corresponding atomic spatial coordinates. When a molecular structure is extracted at the preset time interval, the atomic spatial coordinates of the corresponding frame are extracted, so that each sample conformation is obtained as numerical data describing its atomic spatial structure, which facilitates the calculation of relative spatial positions in subsequent steps.
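The periodic annealing condition can be pictured as a repeating temperature schedule. The sketch below is a minimal Python 3 illustration under stated assumptions: the linear heating and cooling ramps, the 100 ps heating time, and the 600 K and 300 K values are placeholders chosen for this example, since the text only specifies a first and a second preset temperature and a preset time period. The function name `annealing_temperature` is likewise hypothetical.

```python
def annealing_temperature(t_ps, period_ps=500.0, heat_ps=100.0,
                          t_high=600.0, t_low=300.0):
    """Target temperature (K) at simulation time t_ps for a periodic schedule.

    Within every period the temperature ramps linearly from t_low up to
    t_high during the first heat_ps picoseconds, then ramps linearly back
    down to t_low over the rest of the period. All numeric defaults are
    illustrative assumptions, not values from the described method.
    """
    phase = t_ps % period_ps  # position inside the current annealing cycle
    if phase < heat_ps:
        return t_low + (t_high - t_low) * phase / heat_ps
    return t_high - (t_high - t_low) * (phase - heat_ps) / (period_ps - heat_ps)
```

A molecular dynamics engine would be driven with this target temperature through its thermostat; that coupling is outside the scope of the sketch.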
In this step, in order to avoid obtaining excessive redundant conformations, in atomic motion trajectories corresponding to different energies generated by a molecular system based on a preset periodic annealing condition, instantaneous trajectories are extracted from the motion trajectories according to a preset time interval and a preset number, so that a preset number of molecular structures are obtained by accumulative extraction, namely, a preset number of sample conformations are obtained.
For ease of understanding, for example, if the preset simulation duration is 5000 ps and the preset time period is 500 ps, then 10 preset time periods are executed, that is, 10 periodic anneals are performed. Assuming the preset number of sample conformations is 1000, the molecular structures (that is, the sample conformations) can be extracted from the simulated trajectory at uniform or non-uniform preset time intervals. With this design, the random search method and the molecular dynamics method are combined and a preset periodic annealing condition is adopted, so that it is not necessary to traverse all conformations. Only a certain number of sample conformations need to be obtained to ensure a high hit rate and to capture most of the sample conformations that meet the requirements, which reduces the data processing load in the subsequent steps and improves operating efficiency.
Further, in order to ensure that the structure of each sample conformation is reasonable, the structure can be optimized for a limited number of steps using the semi-empirical method DFTB (Density-Functional Tight-Binding). DFTB is a semi-empirical quantum mechanical program that combines the accuracy of density functional theory (DFT) with the efficiency of the tight-binding (TB) method; the atomic orbital wave functions and the interatomic interaction potentials it uses are both obtained by fitting to DMol3 results. After the sample conformation is optimized by the steepest descent method, unreasonable bond lengths and bond angles in its original structure become more reasonable. Furthermore, the optimized DFTB energy of each sample conformation can be stored, so that the DFTB energy can serve as one basis for conformation screening.
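The arithmetic of this example can be checked with a short sketch. It assumes uniform sampling and a trajectory stored as a list in which frame k (zero-based) corresponds to time (k + 1) times the frame spacing; the function names and the 1 ps frame spacing used in the usage note are hypothetical.

```python
def sampling_plan(total_ps, period_ps, n_samples):
    """Return (number of annealing cycles, uniform interval in ps, sample times)."""
    n_cycles = int(total_ps // period_ps)
    interval_ps = total_ps / n_samples
    times = [interval_ps * (k + 1) for k in range(n_samples)]
    return n_cycles, interval_ps, times


def extract_frames(trajectory, frame_dt_ps, sample_times_ps):
    """Pick the stored frame closest to each requested sample time.

    Assumes trajectory[k] was written at time (k + 1) * frame_dt_ps.
    """
    return [trajectory[int(round(t / frame_dt_ps)) - 1] for t in sample_times_ps]
```

With a simulation duration of 5000 ps, a period of 500 ps, and 1000 samples, `sampling_plan(5000, 500, 1000)` gives 10 cycles and a uniform interval of 5 ps, so a trajectory saved every 1 ps would contribute every fifth frame.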
Step S240, respectively obtaining space coordinates corresponding to atoms in the flexible ring according to the ring-in atom serial numbers in the ring information; and determining the ring shape of the flexible ring according to the ring size in the ring information and the space coordinates of atoms in the flexible ring.
It is understood that the sample conformation of the target organic molecule is a three-dimensional stereo structure, with each atom having corresponding spatial coordinates. The method comprises the steps of respectively obtaining space coordinates corresponding to atoms in a flexible ring by taking the serial numbers of the atoms in the ring in ring information as indexes, and converting judgment of ring isomerism into problems of points and surfaces, lines and surfaces in mathematics based on the theory knowledge of stereochemistry and by combining regularity of various ring isomerism, so that the ring shape of the flexible ring is determined, and identification of different ring isomerisms is realized.
Further, in one embodiment, any four atoms in the flexible ring that are not connected end to end are selected to generate corresponding dihedral angles; the included angle of each dihedral angle is determined from the spatial coordinates of the atoms; and the ring shape of the flexible ring is determined from the included angles of the dihedral angles. It should be noted that, owing to the particularity of the five-membered ring, when the flexible ring is a five-membered ring the four selected atoms are not subject to the end-to-end restriction, whereas rings of other sizes must follow the above selection criterion. It will be appreciated that arbitrarily selecting atoms in the flexible ring that are not connected end to end yields two non-adjacent "bonds", which in turn form a corresponding number of dihedral angles. Using the methods of solid geometry, the included angle of each dihedral angle can be calculated from the spatial coordinates of its atoms. The included angle of the dihedral angle then serves as a "measure". For example, when the ring information indicates that the flexible ring is a five-membered ring, its ring shapes can be classified into three types: planar, "envelope", and "cross". If the absolute values of all the "measures" formed by the ring atoms, that is, of all the dihedral included angles, are within 10 degrees, the ring shape of the flexible ring in the sample conformation is planar; if only one dihedral included angle is smaller than 10 degrees, the ring shape is "envelope"; and if all the dihedral included angles are greater than 10 degrees, the ring shape is "cross".
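As a minimal sketch of the five-membered ring rule, the following Python 3 code computes signed dihedral angles from atomic coordinates and applies the 10 degree threshold. The function names, the use of the five consecutive four-atom sequences around the ring, the sign convention of the dihedral, and the "undetermined" label for cases the text does not cover are assumptions made for illustration.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def classify_five_membered_ring(coords, threshold_deg=10.0):
    """coords: five (x, y, z) tuples listed in ring order."""
    angles = [abs(dihedral_deg(*(coords[(i + k) % 5] for k in range(4))))
              for i in range(5)]
    n_small = sum(1 for a in angles if a < threshold_deg)
    if n_small == 5:
        return "planar"
    if n_small == 1:
        return "envelope"
    if n_small == 0:
        return "cross"
    return "undetermined"  # two to four small angles: not covered by the text
```

An ideal envelope, with four coplanar atoms and one atom lifted out of the plane, has exactly one near-zero dihedral angle, which is why a single count is enough to separate it from the other two shapes.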
Further, the determination rule for the ring shape differs slightly depending on the ring size. In one embodiment, taking a six-membered ring as an example and in contrast to the five-membered ring, a reference plane in the flexible ring is determined according to the included angles of the dihedral angles, and the ring shape of the flexible ring is determined according to the position of the atoms outside the reference plane relative to the reference plane. For ease of understanding, the ring shapes of a six-membered ring generally include "chair", "twist", "half-chair", and "boat", as shown in fig. 4. After a plurality of dihedral angles are obtained by selecting any four atoms that are not connected end to end, the reference plane is selected by comparing the included angles of the dihedral angles. The reference plane is the one whose dihedral included angle is closest to planar among all the dihedral angles, that is, the plane that passes through the largest number of atoms of the flexible ring. The reference plane is taken as the spatial position reference for the ring atoms. If three planes exist in the ring and the atoms at the two ends are on different sides of the reference plane, the ring shape of the flexible ring in the sample conformation is "chair"; if no plane exists in the ring, the ring shape is "twist"; if there is one atom in the reference plane, the ring shape is "half-chair"; and for the remaining sample conformations, which have none of the above ring shapes, the ring shape is "boat". In one embodiment, when the flexible ring is a seven-membered or eight-membered ring, the determination rule for the six-membered ring can be consulted to determine the ring shape of the flexible ring in the corresponding sample conformation.
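The reference plane idea for a six-membered ring can be sketched as follows. This is a partial illustration in Python 3, not the full rule set: it tests the three four-atom sets (i, i+1, i+3, i+4), takes the most planar one as the reference plane, and separates "chair", "boat", and "twist" only. The half-chair rule is not implemented and such cases are returned as "unclassified". The function names, the 10 degree planarity tolerance, and the choice of four-atom sets are assumptions for this example.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, _cross(b2, b3))
    x = _dot(_cross(b1, b2), _cross(b2, b3))
    return math.degrees(math.atan2(y, x))


def classify_six_membered_ring(coords, planar_tol_deg=10.0):
    """coords: six (x, y, z) tuples listed in ring order."""
    # Candidate planes: two non-adjacent ring bonds, (i, i+1) and (i+3, i+4).
    sets = [(i, i + 1, i + 3, (i + 4) % 6) for i in range(3)]
    angles = [abs(dihedral_deg(*(coords[j] for j in s))) for s in sets]
    n_planar = sum(1 for a in angles if a < planar_tol_deg)
    if n_planar == 0:
        return "twist"
    # Reference plane: the candidate whose dihedral angle is closest to zero.
    i_ref = min(range(3), key=lambda i: angles[i])
    a, b, c = (coords[j] for j in sets[i_ref][:3])
    normal = _cross(_sub(b, a), _sub(c, a))
    # The two "end" atoms are the ring atoms outside the reference plane.
    ends = [(i_ref + 2) % 6, (i_ref + 5) % 6]
    sides = [_dot(normal, _sub(coords[j], a)) for j in ends]
    opposite = sides[0] * sides[1] < 0
    if n_planar == 3 and opposite:
        return "chair"
    if not opposite:
        return "boat"
    return "unclassified"  # e.g. half-chair, which this sketch does not test
```

In an ideal chair all three candidate sets are coplanar and the two end atoms sit on opposite sides of the reference plane, whereas an ideal boat has a single planar set with both end atoms on the same side.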
Step S250, when any flexible ring in the sample conformation contains a substituent, determining the posture of the substituent according to the relative position relation between the substituent and the reference surface, the type of the substituent and the position information of the substituent.
It should be noted that, according to the ring information of each flexible ring, if it is determined that a substituent is included in the flexible ring, the present step S250 is performed, otherwise, the present step is not required to be performed. That is, when the target organic molecule includes more than one flexible ring, each flexible ring is selected according to the actual situation of whether the target organic molecule includes a substituent, and the step S250 is performed accordingly.
It is understood that the presence of a substituent breaks the symmetry of the ring shape and thereby gives rise to a variety of substituent attitudes. When there is only one substituent, its attitude can be determined from the position of the substituent relative to the reference plane. Referring to fig. 2 again, taking the "envelope" ring shape of the five-membered ring as an example and using the reference plane as the reference, the position of the atom that deviates from the reference plane is compared with the position of the substituent: if the atom and the substituent are located on the same side of the reference plane, the substituent attitude may be "up"; if they are located on opposite sides of the reference plane, the substituent attitude may be "down". To determine the substituent attitude quantitatively, in one embodiment the attitude is determined from the sign of the dihedral angle formed by the substituent atom and three atoms in the reference plane. When the dihedral angle is positive, the substituent attitude is the "up structure"; when the dihedral angle is negative, the substituent attitude is the "down structure". Referring again to fig. 4, taking the six-membered ring as an example, the substituent attitudes include the substituent being parallel to the reference plane and the substituent being perpendicular to the reference plane. Similarly, the corresponding substituent attitude is determined by calculating the included angle of the dihedral angle formed by the substituent atom and three atoms in the reference plane.
Further, the position information of the substituent in the ring information also affects the generation of the ring isomerism information in the subsequent step. That is, when a single substituent is attached to different ring atoms, 3 conformations can be derived for a given ring shape, as shown in fig. 2, which takes the "down structure" substituent attitude in the five-membered ring as an example. When there is more than one substituent, the symmetry of the ring shape decreases further and the number of conformation types increases. For example, for a five-membered ring having only C1 (that is, the carbon atom with serial number 1 on the ring) as a symmetry element, the total number of ring isomers that can be formed is generally 10, namely 5 down structures and 5 up structures. Therefore, after the position of the substituent relative to the reference plane is determined, the accurate substituent attitude can be obtained from the position and type of the substituent, so that accurate ring isomerism information can be obtained in the subsequent steps.
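The sign test for the substituent attitude can be sketched as below in Python 3. The order in which the three reference-plane atoms are passed fixes which side counts as positive, and that ordering is an assumption of this example because the text does not specify it. The function names and the "in-plane" label for a substituent lying exactly in the reference plane are also hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, _cross(b2, b3))
    x = _dot(_cross(b1, b2), _cross(b2, b3))
    return math.degrees(math.atan2(y, x))


def substituent_attitude(ref_a, ref_b, ref_c, substituent, tol_deg=1e-6):
    """Classify a substituent by the sign of the dihedral ref_a-ref_b-ref_c-substituent.

    ref_a, ref_b, ref_c are three atoms of the reference plane, with the
    substituent bonded to ref_c. Positive angles are reported as "up" and
    negative angles as "down".
    """
    angle = dihedral_deg(ref_a, ref_b, ref_c, substituent)
    # A dihedral of 0 or 180 degrees means the substituent lies in the plane.
    if abs(angle) < tol_deg or abs(abs(angle) - 180.0) < tol_deg:
        return "in-plane"
    return "up" if angle > 0 else "down"
```

Because the sign flips when the reference atoms are listed in the opposite direction around the ring, the same traversal direction has to be used for every conformation that is compared.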
And step S260, acquiring ring isomerism information of the target organic molecule according to the ring shape and the substituent posture.
If the flexible ring contains a substituent, the ring shape obtained in step S240 above and the substituent attitude obtained in step S250 are combined to determine the ring isomerism information of the flexible ring in the target organic molecule, it being understood that different substituent attitudes combined with the same ring shape will yield different ring isomerism information. According to the definite ring isomerism information, a user can conveniently and visually identify the ring isomerism derived from the flexible ring of the target organic molecule, so that the user can conveniently screen the ring isomerism meeting the requirement from the sample conformation. The search of the ring isomerism information can be performed on the same target organic molecule, thereby facilitating the user to search for a desired conformation, such as a low energy conformation or a diverse conformation thereof.
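One way to picture how the ring shape and the substituent attitudes combine into ring isomerism information, and how a user might then screen for low-energy conformations, is the sketch below. The label format, the dictionary keys `ring_shape`, `attitudes`, and `energy`, and the function names are hypothetical; the energy stands for any stored per-conformation energy, such as the optimized DFTB energy.

```python
def ring_isomer_label(ring_shape, attitudes=()):
    """Combine a ring shape with (ring position, attitude) pairs into one label.

    With no substituent the label is simply the ring shape.
    """
    if not attitudes:
        return ring_shape
    parts = ["{}-{}".format(pos, att) for pos, att in sorted(attitudes)]
    return ring_shape + ":" + ",".join(parts)


def lowest_energy_per_isomer(conformations):
    """Group sample conformations by label and keep the lowest-energy one of each."""
    best = {}
    for conf in conformations:
        label = ring_isomer_label(conf["ring_shape"], conf.get("attitudes", ()))
        if label not in best or conf["energy"] < best[label]["energy"]:
            best[label] = conf
    return best
```

Two conformations with the same ring shape but different substituent attitudes receive different labels, so they are kept as distinct ring isomers during screening.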
In other embodiments, if the flexible ring does not contain a substituent, the ring isomerism information of the target organic molecule is determined directly from the ring shape obtained in step S240 above.
After ring isomerism information is obtained, stability of each ring isomerism is different due to the characteristics of the molecular structure of each ring isomerism. In one embodiment, each ring isomer is optimized by quantum chemical computation to obtain a stable isomer corresponding to the target organic molecule. Taking medicinal organic molecules as an example, the ring isomerism of the sample conformation is optimized by a quantum chemical calculation method to obtain the most stable isomer, so that the stability of the pharmacological property is ensured.
In order to verify the reliability of the organic molecule ring isomerism processing method, common four-membered, five-membered, and six-membered rings were selected, and the above steps were performed on more than one hundred different target organic molecules as samples, including monocyclic structures and polycyclic structures such as fused rings and spiro rings. In these tests, the hit rate of the obtained ring isomers was higher than 80% in monocyclic systems and reached 76.9% in polycyclic systems. That is, the sample conformations collected by the method covered more than 80% of the total number of ring isomers of the target organic molecule in monocyclic systems and 76.9% in polycyclic systems. Meanwhile, for each acquired sample conformation, the ring isomerism information could be identified accurately. Therefore, the processing method achieves a high hit rate and, except for extremely complex flexible rings (such as nine-membered or larger rings), can be widely applied to the collection and accurate identification of flexible ring conformations.
It can be seen from this embodiment that, in the organic molecule ring isomerism processing method of the present application, after it is determined that a target organic molecule has a flexible ring, different sample conformations can be collected as comprehensively as possible under a preset periodic annealing condition. Once sufficient sample conformations have been obtained, the ring shape and the substituent attitude of each sample conformation are identified according to stereochemistry, so that accurate ring isomerism information is obtained for each sample conformation. With this design, conformations do not have to be built by manual modeling and the ring isomerism information does not have to be identified manually. The sample conformations are acquired automatically and the ring isomerism information is identified automatically, which reduces labor cost, improves operating efficiency, ensures the hit rate of the sample conformations, and reduces the omission of sample conformations during acquisition.
Fig. 5 is a schematic flow chart of a method for obtaining a conformation of an organic molecule sample, shown in an example of the present application.
Referring to fig. 5, a method for obtaining a conformation of an organic molecule sample according to an embodiment of the present application includes:
Step S310, simulating the motion trajectory of the target organic molecule in a preset force field according to preset periodic annealing conditions.
In one embodiment, the motion trajectory of the target organic molecule is simulated in a preset force field by a random search algorithm and a molecular dynamics method according to preset periodic annealing conditions. Wherein, the initial molecular structure can be obtained in advance according to the molecular information of the target organic molecule, and the initial molecular structure is used as the initial configuration to simulate the movement. The initial molecular structure can be obtained by referring to the related description of step S210, which is not described herein.
Further, the preset periodic annealing condition may be that the temperature is raised to a first preset temperature every preset interval duration within a preset time period, and then the temperature is lowered to a second preset temperature, so as to cyclically execute a plurality of preset time periods within a preset simulation duration. By designing a preset periodic annealing condition, the atomic motion trajectory of the target organic molecule can span energy barriers between different conformations within a preset simulation duration, and then a more comprehensive motion trajectory is obtained.
Step S320, obtaining molecular structures of the molecular system of the target organic molecule in different states in the motion trajectory according to a preset time interval, and obtaining a preset number of sample conformations.
And extracting the instantaneous motion trail, namely extracting the molecular structure corresponding to the current motion trail, within the preset simulation duration according to the preset time interval, thereby obtaining the sample conformations with the preset number. For the related description of this step, reference may be made to the description of step S230, which is not described herein again.
Compared with a systematic search algorithm, the method for obtaining organic molecule sample conformations combines a random search method with a molecular dynamics method and adopts the preset periodic annealing condition. It therefore only needs to obtain a certain number of sample conformations rather than traverse all conformations, has a lower computational cost and higher processing efficiency, follows a definite sampling strategy, and can obtain sample conformations with a high hit rate.
FIG. 6 is a schematic flow chart of a method for identifying ring isomerism of an organic molecule according to an embodiment of the present application.
Referring to fig. 6, a method for identifying organic molecule ring isomerism according to an embodiment of the present application includes:
Step S410, ring information of each flexible ring in the sample conformation of the target organic molecule is obtained, and the ring information includes a ring type, a spatial coordinate of an atom in the ring, and whether a substituent exists.
This example is based on the recognition of organic molecule ring isomerism when a flexible ring is present in the conformation of the target organic molecule.
The space coordinate of the ring-inside atom can be inquired through the sequence number of the ring-inside atom. For details of this step, reference may be made to the related description of step S230, which is not described herein again.
Step S420, determining the ring shape of each flexible ring according to the ring size of each flexible ring and the relative spatial position of each atom in each flexible ring; when any of the flexible rings in the sample conformation contains a substituent, the substituent attitude of all substituents on each flexible ring is determined based on the relative spatial positions of the substituent and the flexible ring in which the substituent is located.
Furthermore, the space coordinates corresponding to the atoms in the flexible ring can be respectively obtained according to the serial numbers of the atoms in the ring information, so that the relative space positions of the atoms in the ring can be determined. It is understood that different ring sizes have different ring shapes, and in order to narrow the recognition range, the ring shape of the flexible ring is determined based on the ring size in the ring information and the spatial coordinates of the atoms in the flexible ring. For the specific identification process of the ring shape, reference may be made to the related description of step S240, which is not described herein again.
The determination rule for the ring shape differs slightly depending on the ring size. For example, the determination rules for the ring shapes of the four-membered, five-membered, and six-membered rings differ from one another, while the determination rules for the seven-membered and eight-membered rings refer to that of the six-membered ring. For the specific determination rules, reference may be made to the related description of step S240, which is not repeated here. Further, when any of the flexible rings contains a substituent, the ring information also includes the type of the substituent and the position information of the substituent. After the ring shape of the flexible ring is determined, if the flexible ring contains a substituent, the ring information is further used to determine the corresponding substituent attitude; for the specific determination rule, reference may be made to the related description of step S250, which is not repeated here. If the flexible ring does not contain a substituent, no substituent attitude needs to be determined.
The identification method for the organic molecule ring isomerism is based on ring information of a flexible ring in a sample conformation of a target organic molecule, the flexible rings with different ring sizes are classified and identified through the difference of the ring sizes of the flexible rings, and judgment of the ring isomerism is converted into the problems of point and plane, line and plane and the like in mathematics by utilizing knowledge of stereochemistry and regularity of various ring conformations based on the relative spatial position of each atom of the flexible ring and the relative spatial position of a substituent in the flexible ring, so that the identification of the ring isomerism of the flexible ring of the organic molecule is realized.
Corresponding to the embodiment of the application function implementation method, the application also provides a processing device for organic molecular ring isomerism, a device for obtaining organic molecular sample conformation, an identification device for organic molecular ring isomerism, electronic equipment and corresponding embodiments.
Fig. 7 is a schematic structural view of a processing apparatus for organic molecule ring isomerization shown in an example of the present application.
Referring to fig. 7, the processing apparatus 10 for organic molecular ring isomerization of the present application includes a conformation acquisition module 110, a conformation recognition module 120, and an information generation module 130, wherein:
the conformation acquisition module 110 is configured to acquire a sample conformation of the target organic molecule after determining that the initial molecular structure of the target organic molecule comprises a flexible loop.
The conformation recognition module 120 is configured to determine a ring shape of each flexible ring according to a ring size of each flexible ring and a relative spatial position of each atom in each flexible ring; when any of the flexible rings in the sample conformation contains a substituent, the substituent attitude of all substituents on each flexible ring is determined based on the relative spatial positions of the substituent and the flexible ring in which the substituent is located.
The information generating module 130 is configured to obtain ring isomerism information of the sample conformation of the target organic molecule according to the ring shape of the flexible rings each including a substituent and the substituent poses of all substituents.
Further, the conformation acquiring module 110 is configured to acquire molecular structures of the molecular system of the target organic molecule in different states in the motion trajectory simulated by the preset force field according to the preset periodic annealing condition and the preset time interval, so as to obtain a preset number of sample conformations. The conformation recognition module 120 is configured to obtain spatial coordinates corresponding to atoms in the flexible ring; the ring shape of the flexible ring is determined from the spatial coordinates of the atoms.
The conformation recognition module 120 is configured to select any four atoms that are not connected end to end in the flexible ring, and generate corresponding dihedral angles respectively; determining the included angle of the corresponding dihedral angle according to the space coordinate of the atom; and determining the ring shape of the flexible ring according to the included angle of each dihedral angle. Specifically, a reference surface in the flexible ring can be determined according to the included angle; and determining the ring shape of the flexible ring according to the relative position relation of the atoms outside the reference surface and the reference surface. The conformation recognition module 120 is further configured to determine the pose of the substituent according to the relative position relationship between the substituent and the reference surface.
Fig. 8 is another schematic structural view of a processing apparatus for organic molecule ring isomerization shown in the embodiment of the present application.
Further, referring to fig. 8, the processing apparatus for organic molecular ring isomerization of the present application includes a flexible ring identification module 140, a conformation acquisition module 110, an information acquisition module 150, a conformation identification module 120, an information generation module 130, and an optimization module 160. Wherein:
the flexible ring identification module 140 is configured to determine whether the initial molecular structure includes a flexible ring based on the atomic connection information and the bond order information of the initial molecular structure. Specifically, the flexible ring identification module 140 is configured to determine whether the initial molecular structure includes a ring structure according to the connection information of each atom in the target organic molecule; when the initial molecular structure contains a ring structure, whether the ring structure is a flexible ring is judged based on the bond order information of the chemical bonds between the atoms located in the ring structure.
The conformation acquiring module 110 is configured to acquire a sample conformation of the target organic molecule after the flexible ring recognition module 140 determines that the initial molecular structure of the target organic molecule includes a flexible ring.
After the flexible ring recognition module 140 determines that the initial molecular structure of the target organic molecule includes a flexible ring, the information acquisition module 150 is configured to acquire ring information of the flexible ring, where the ring information includes a ring type, whether a substituent exists, a substituent type, and position information of the substituent.
The conformation recognition module 120 is configured to determine the ring shape of each flexible ring according to the ring size of each flexible ring and the relative spatial position of each atom in each flexible ring; when any of the flexible rings in the sample conformation contains a substituent, the substituent posture of all substituents on each flexible ring is determined based on the relative spatial positions of the substituent and the flexible ring on which the substituent is located.
The information generation module 130 is configured to obtain ring isomerism information of the sample conformation of the target organic molecule according to the ring shape of each flexible ring containing a substituent and the substituent postures of all substituents determined by the conformation recognition module 120.
The optimization module 160 is configured to optimize the ring isomerism information generated by the information generation module 130 through quantum chemical calculation, so as to obtain a stable isomer corresponding to the target organic molecule.
According to this embodiment, the processing apparatus for organic molecule ring isomerism requires neither manual modeling to obtain conformations nor manual identification of ring isomerism information, which reduces labor cost and improves operating efficiency; it also ensures the hit rate of sample conformations and reduces the omission of sample conformations during acquisition.
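The flexible ring identification step performed by module 140 can be illustrated with a small graph-based sketch. The text above states only that the decision uses atomic connectivity and bond-order information; the concrete rule used here (a ring of four to eight atoms whose ring bonds all have bond order 1) is a simplifying assumption for illustration, and the names `find_rings` and `is_flexible_ring` as well as the `(atom_i, atom_j, bond_order)` input format are hypothetical.

```python
from collections import deque

def find_rings(n_atoms, bonds):
    """Return rings as tuples of atom indices in ring order.

    bonds: list of (atom_i, atom_j, bond_order). For every bond, the shortest
    path between its two atoms that avoids the bond itself closes a ring.
    """
    adj = {i: set() for i in range(n_atoms)}
    for i, j, _ in bonds:
        adj[i].add(j)
        adj[j].add(i)
    rings = {}
    for i, j, _ in bonds:
        prev = {i: None}
        queue = deque([i])
        while queue and j not in prev:
            u = queue.popleft()
            for v in adj[u]:
                # Skip visited atoms and the direct i-j bond being tested.
                if v in prev or (u == i and v == j):
                    continue
                prev[v] = u
                queue.append(v)
        if j in prev:
            path, node = [], j
            while node is not None:
                path.append(node)
                node = prev[node]
            rings[frozenset(path)] = tuple(path)  # de-duplicate by atom set
    return list(rings.values())

def is_flexible_ring(ring, bonds, min_size=4, max_size=8):
    """Assumed rule: ring size within limits and every ring bond is single."""
    if not min_size <= len(ring) <= max_size:
        return False
    orders = {frozenset((i, j)): order for i, j, order in bonds}
    pairs = zip(ring, ring[1:] + ring[:1])
    return all(orders[frozenset(pair)] == 1 for pair in pairs)

# Methylcyclohexane skeleton: ring atoms 0-5 plus substituent atom 6 on atom 0.
methylcyclohexane = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1),
                     (4, 5, 1), (5, 0, 1), (0, 6, 1)]
for ring in find_rings(7, methylcyclohexane):
    print(ring, is_flexible_ring(ring, methylcyclohexane))
```

A production workflow would more likely rely on a cheminformatics toolkit for ring perception; the breadth-first search here is only meant to show how connectivity alone reveals a ring and how bond orders then separate flexible rings from rigid ones.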
Fig. 9 is a schematic structural diagram of an apparatus for obtaining a conformation of an organic molecule sample shown in an embodiment of the present application.
Referring to fig. 9, the apparatus 20 for obtaining a conformation of an organic molecule sample of the present application comprises a simulation module 210 and an acquisition module 220, wherein:
the simulation module 210 is configured to simulate a motion trajectory of the target organic molecule in a preset force field according to a preset periodic annealing condition. Further, the simulation module 210 is configured to simulate the motion trajectory of the target organic molecule in the preset force field according to the preset periodic annealing condition using a molecular dynamics method combined with a random search algorithm.
The acquisition module 220 is configured to acquire molecular structures of a molecular system of the target organic molecule in different states in the motion trajectory according to a preset time interval, and acquire a preset number of sample conformations. Further, the acquisition module 220 is configured to obtain the preset number of sample conformations at the preset time interval within a preset simulation duration.
The device for obtaining the organic molecule sample conformation combines a random search method with a molecular dynamics method and adopts a preset periodic annealing condition. It only needs to obtain a certain number of sample conformations rather than traversing all conformations, so it has lower computational cost and higher processing efficiency; it also has a definite sampling strategy and can obtain sample conformations with a higher hit rate.
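The periodic annealing condition and the fixed-interval sampling described for modules 210 and 220 can be sketched as a temperature schedule. This is an illustrative sketch under stated assumptions: the linear heating and cooling ramps, the `heat_fraction` parameter, the example temperatures and durations, and the function names are all hypothetical, and no molecular dynamics engine is invoked; the schedule only shows when a structure would be sampled and at what target temperature.

```python
def annealing_temperature(t, period, t_low, t_high, heat_fraction=0.5):
    """Target temperature at time t for a repeating heat-then-cool cycle.

    Within each period the temperature rises linearly from t_low to t_high
    during the first heat_fraction of the period, then falls back to t_low.
    """
    phase = (t % period) / period
    if phase < heat_fraction:
        return t_low + (t_high - t_low) * phase / heat_fraction
    cooled = (phase - heat_fraction) / (1.0 - heat_fraction)
    return t_high - (t_high - t_low) * cooled

def sampling_schedule(sim_duration, sample_interval, period, t_low, t_high):
    """List of (time, target temperature) pairs, one per sampled conformation."""
    n_samples = int(sim_duration // sample_interval)
    return [(k * sample_interval,
             annealing_temperature(k * sample_interval, period, t_low, t_high))
            for k in range(1, n_samples + 1)]

# Hypothetical settings: 1000 time units in total, one sample every 10 units,
# annealing period of 100 units between 300 and 600 temperature units.
schedule = sampling_schedule(1000, 10, 100, 300.0, 600.0)
print(len(schedule), schedule[:3])
```

Because the number of samples is fixed by the simulation duration and the sampling interval, the cost of the sampling stage is known in advance, which matches the stated goal of collecting a preset number of conformations instead of enumerating all of them.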
Fig. 10 is a schematic structural diagram of an identification apparatus for organic molecule ring isomerism shown in an example of the present application.
Referring to fig. 10, the apparatus 30 for identifying organic molecule ring isomerism of the present application includes a ring information acquiring module 310 and a conformation identifying module 320, wherein:
the ring information obtaining module 310 is configured to obtain ring information of each flexible ring in the sample conformation of the target organic molecule, where the ring information includes a ring type, a spatial coordinate of an atom in the ring, and whether a substituent exists.
The conformation recognition module 320 is configured to determine the ring shape of each flexible ring according to the ring size of each flexible ring and the relative spatial position of each atom in each flexible ring; when any flexible ring in the conformation contains a substituent, the substituent posture of all substituents on each flexible ring is determined from the relative spatial positions of the substituent and the flexible ring on which the substituent is located. In one embodiment, the conformation recognition module 320 may be the same as the conformation recognition module 120 in the organic molecule ring isomerism processing apparatus 10.
The device for identifying organic molecule ring isomerism uses knowledge of stereochemistry and the regularity of the various ring conformations to convert the judgment of ring isomerism into mathematical problems of points and planes, lines and planes, and the like, based on the relative spatial positions of the atoms of the flexible ring and of the substituents on it, thereby identifying the different ring isomers.
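The "line and plane" formulation mentioned above can be illustrated for the substituent posture. The sketch below is an assumption-laden example rather than the method of the present application: it labels a substituent as axial or equatorial from the angle between the ring-atom-to-substituent bond and the normal of the reference surface, and as up or down from the side of the surface the bond points to. The 45 degree threshold, the up/down convention, and the function names are hypothetical.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit_normal(p0, p1, p2):
    """Unit normal of the plane through three non-collinear points."""
    n = cross(sub(p1, p0), sub(p2, p0))
    length = math.sqrt(dot(n, n))
    return (n[0] / length, n[1] / length, n[2] / length)

def substituent_posture(ring_atom, substituent_atom, plane_points,
                        axial_max_deg=45.0):
    """Return (orientation, side) for one substituent.

    orientation: 'axial' if the bond lies within axial_max_deg of the plane
    normal, otherwise 'equatorial' (threshold is an illustrative assumption).
    side: 'up' if the bond points along the normal, otherwise 'down'.
    """
    normal = unit_normal(*plane_points[:3])
    bond = sub(substituent_atom, ring_atom)
    cos_angle = dot(bond, normal) / math.sqrt(dot(bond, bond))
    # Clamp against rounding before acos; use |cos| to measure tilt from the normal line.
    angle = math.degrees(math.acos(min(1.0, abs(cos_angle))))
    orientation = "axial" if angle <= axial_max_deg else "equatorial"
    side = "up" if cos_angle >= 0 else "down"
    return orientation, side

# Reference surface taken as the z = 0 plane for this example.
plane = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
print(substituent_posture((0.0, 0.0, 0.3), (0.0, 0.0, 1.8), plane))
print(substituent_posture((0.0, 0.0, 0.3), (1.4, 0.0, 0.1), plane))
```

Note that the direction of the normal, and therefore the meaning of "up", depends on the order of the three plane points, so a consistent atom ordering is needed when postures on the same ring are compared.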
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 11 is a schematic structural diagram of an electronic device shown in an embodiment of the present application.
Referring to fig. 11, the electronic device 1000 includes a memory 1010 and a processor 1020.
The processor 1020 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 1010 may include various types of storage units, such as system memory, read-only memory (ROM), and persistent storage. The ROM may store static data or instructions needed by the processor 1020 or other modules of the computer. The persistent storage may be a readable and writable non-volatile storage device that does not lose stored instructions and data even after the computer is powered off. In some embodiments, the persistent storage is a mass storage device (e.g., a magnetic or optical disk, or flash memory). In other embodiments, the persistent storage may be a removable storage device (e.g., a floppy disk or an optical drive). The system memory may be a readable and writable storage device or a volatile readable and writable storage device, such as dynamic random access memory. The system memory may store instructions and data that some or all of the processors require at runtime. Further, the memory 1010 may comprise any combination of computer-readable storage media, including various types of semiconductor memory chips (e.g., DRAM, SRAM, SDRAM, flash memory, programmable read-only memory); magnetic and/or optical disks may also be employed. In some embodiments, the memory 1010 may include a readable and/or writable removable storage device, such as a compact disc (CD), a read-only digital versatile disc (e.g., DVD-ROM, dual-layer DVD-ROM), a read-only Blu-ray disc, an ultra-density disc, a flash memory card (e.g., SD card, mini SD card, micro-SD card), a magnetic floppy disk, and the like. Computer-readable storage media do not include carrier waves or transitory electronic signals transmitted wirelessly or by wire.
The memory 1010 has stored thereon executable code that, when processed by the processor 1020, may cause the processor 1020 to perform some or all of the methods described above.
Furthermore, the method according to the present application may also be implemented as a computer program or computer program product comprising computer program code instructions for performing some or all of the steps of the above-described method of the present application.
Alternatively, the present application may also be embodied as a computer-readable storage medium (or non-transitory machine-readable storage medium or machine-readable storage medium) having executable code (or a computer program or computer instruction code) stored thereon, which, when executed by a processor of an electronic device (or server, etc.), causes the processor to perform part or all of the various steps of the above-described method according to the present application.
Having described embodiments of the present application, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (13)

1. A method for processing organic molecule ring isomerism, characterized by comprising the following steps:
obtaining a sample conformation of a target organic molecule after determining that an initial molecular structure of the target organic molecule comprises a flexible ring; the obtaining of the sample conformation of the target organic molecule comprises: according to a preset periodic annealing condition and a preset time interval, obtaining molecular structures of a molecular system of the target organic molecule in different states in a motion track simulated by a preset force field, so as to obtain a preset number of sample conformations; the preset periodic annealing condition is that the temperature is raised to a first preset temperature every a preset interval duration within a preset time period, and then the temperature is lowered to a second preset temperature, so that a plurality of preset time periods are circularly executed within a preset simulation duration;
determining the ring shape of each flexible ring according to the ring size of each flexible ring and the relative spatial position of each atom in each flexible ring; when any flexible ring in the sample conformation comprises a substituent, determining the substituent posture of all the substituents on each flexible ring according to the relative spatial positions of the substituent and the flexible ring in which the substituent is positioned; wherein: respectively acquiring space coordinates corresponding to atoms in the flexible ring; selecting any four atoms which are not connected end to end in the flexible ring, and respectively generating corresponding dihedral angles; determining the included angle of the corresponding dihedral angle according to the space coordinate of the atom; determining a reference surface in the flexible ring according to the included angle; determining the ring shape of the flexible ring according to the relative position relation of atoms outside the reference surface and the reference surface;
obtaining ring isomerism information of the sample conformation of the target organic molecule according to the ring shape of each flexible ring containing the substituent and the substituent postures of all the substituents;
and optimizing the ring isomerism information through quantum chemical calculation to obtain a stable isomer corresponding to the target organic molecule.
2. The method of claim 1, wherein after determining that the initial molecular structure of the target organic molecule comprises a flexible ring, and before obtaining the sample conformation of the target organic molecule, further comprising:
determining whether the initial molecular structure comprises a flexible ring based on the atomic connection information and bond-order information of the initial molecular structure.
3. The method of claim 1, wherein after determining that the initial molecular structure of the target organic molecule comprises a flexible ring, further comprising:
and acquiring ring information of the flexible ring, wherein the ring information comprises ring type, existence or nonexistence of a substituent, substituent type and position information of the substituent.
4. The method of claim 1, wherein the flexible ring comprises one or more of a four-membered ring, a five-membered ring, a six-membered ring, a seven-membered ring, and an eight-membered ring.
5. The method of claim 3, wherein the ring information further comprises in-ring atom serial numbers;
the respectively obtaining the spatial coordinates corresponding to the atoms in the flexible ring comprises:
and respectively acquiring the space coordinates corresponding to the atoms in the flexible ring according to the in-ring atom serial numbers in the ring information.
6. The method of claim 1, wherein determining the substituent posture of all substituents on each flexible ring based on the relative spatial positions of the substituents and the flexible ring on which the substituents are located comprises:
and determining the substituent posture according to the relative position relation between the substituent and the reference surface, the type of the substituent and the position information of the substituent.
7. The method of claim 1, wherein the optimizing the ring isomerism information by quantum chemical calculation to obtain the stable isomer corresponding to the target organic molecule comprises:
optimizing the structure of the sample conformation for a limited number of steps using a semi-empirical method.
8. The method of claim 1, further comprising:
when the flexible ring does not contain a substituent, ring isomerism information of the target organic molecule is obtained according to the ring shape.
9. A method of obtaining a conformation of a sample of organic molecules, comprising:
simulating the motion trail of the target organic molecule in a preset force field according to preset periodic annealing conditions;
according to a preset time interval, obtaining molecular structures of a molecular system of the target organic molecules in different states in the motion trail, and obtaining a preset number of sample conformations;
the preset periodic annealing condition is that the temperature is increased to a first preset temperature every a preset interval duration within a preset time period, and then the temperature is reduced to a second preset temperature, so that a plurality of preset time periods are executed in a circulating manner within a preset simulation duration.
10. An apparatus for processing organic molecule ring isomerism, comprising:
the conformation acquisition module is used for acquiring the sample conformation of the target organic molecule after determining that the initial molecular structure of the target organic molecule comprises a flexible ring; the obtaining of the sample conformation of the target organic molecule comprises: according to a preset periodic annealing condition and a preset time interval, obtaining molecular structures of a molecular system of the target organic molecule in different states in a motion track simulated by a preset force field, so as to obtain a preset number of sample conformations; the preset periodic annealing condition is that the temperature is raised to a first preset temperature every a preset interval duration within a preset time period, and then the temperature is lowered to a second preset temperature, so that a plurality of preset time periods are circularly executed within a preset simulation duration;
the conformation recognition module is used for determining the ring shape of each flexible ring according to the ring size of each flexible ring and the relative spatial position of each atom in each flexible ring; when any flexible ring in the sample conformation comprises a substituent, determining the substituent posture of all the substituents on each flexible ring according to the relative spatial positions of the substituent and the flexible ring in which the substituent is positioned; wherein: respectively acquiring space coordinates corresponding to atoms in the flexible ring; selecting any four atoms which are not connected end to end in the flexible ring, and respectively generating corresponding dihedral angles; determining the included angle of the corresponding dihedral angle according to the space coordinate of the atom; determining a reference surface in the flexible ring according to the included angle; determining the ring shape of the flexible ring according to the relative position relation of atoms outside the reference surface and the reference surface;
an information generation module, configured to obtain ring isomerism information of the sample conformation of the target organic molecule according to the ring shape of each of the flexible rings including a substituent and the substituent postures of all the substituents;
and the optimization module is used for optimizing the ring isomerism information generated by the information generation module through quantum chemical calculation to obtain a stable isomer corresponding to the target organic molecule.
11. An apparatus for obtaining a conformation of a sample of organic molecules, comprising:
the simulation module is used for simulating the motion trail of the target organic molecule in a preset force field according to a preset periodic annealing condition; the preset periodic annealing condition is that the temperature is raised to a first preset temperature every a preset interval duration within a preset time period, and then the temperature is lowered to a second preset temperature, so that a plurality of preset time periods are circularly executed within a preset simulation duration;
and the acquisition module is used for acquiring the molecular structures of the molecular system of the target organic molecule in different states in the motion trail according to a preset time interval and acquiring a preset number of sample conformations.
12. An electronic device, comprising: a processor; and
a memory having executable code stored thereon, which when executed by the processor, causes the processor to perform the method of any one of claims 1-9.
13. A computer-readable storage medium having stored thereon executable code, which when executed by a processor of an electronic device, causes the processor to perform the method of any one of claims 1-9.
CN202111468255.7A 2021-12-03 2021-12-03 Organic molecule ring isomerism processing method and identification method, and method and device for obtaining organic molecule sample conformation Active CN114171131B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111468255.7A CN114171131B (en) 2021-12-03 2021-12-03 Organic molecule ring isomerism processing method and identification method, and method and device for obtaining organic molecule sample conformation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111468255.7A CN114171131B (en) 2021-12-03 2021-12-03 Organic molecule ring isomerism processing method and identification method, and method and device for obtaining organic molecule sample conformation

Publications (2)

Publication Number Publication Date
CN114171131A CN114171131A (en) 2022-03-11
CN114171131B true CN114171131B (en) 2023-04-07

Family

ID=80482853

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111468255.7A Active CN114171131B (en) 2021-12-03 2021-12-03 Organic molecule ring isomerism processing method and identification method, and method and device for obtaining organic molecule sample conformation

Country Status (1)

Country Link
CN (1) CN114171131B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7751987B1 (en) * 2000-08-16 2010-07-06 Ramot At Tel-Aviv University Ltd. Method and system for predicting amino acid sequences compatible with a specified three dimensional structure
CN113470756A (en) * 2021-07-05 2021-10-01 中国科学院化学研究所 Method and model for constructing aromatic ring compound and aliphatic compound cross-linking model

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005024696A2 (en) * 2003-09-09 2005-03-17 Irm, Llc Method and system for clustering and rescaling for molecular analysis
CA2766496A1 (en) * 2009-06-24 2010-12-29 Foldyne Technology B. V. Molecular structure analysis and modelling
US20110144966A1 (en) * 2009-11-11 2011-06-16 Goddard Iii William A Methods for prediction of binding poses of a molecule
US11107557B1 (en) * 2017-02-02 2021-08-31 Ajay Naresh Jain Force field based molecular structure and conformer generation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
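Lu Huizhe; Wang Ming'an; Wang Daoquan. Theoretical study on the preferred conformations of α,α'-cis-disubstituted cyclododecanones with identical substituents and their interconversion. Computers and Applied Chemistry. 2005, (011), full text. *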

Also Published As

Publication number Publication date
CN114171131A (en) 2022-03-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant