CN113140262B - Chemical molecule synthesis simulation method and device - Google Patents
- Publication number
- CN113140262B (application number CN202110448408.5A)
- Authority
- CN
- China
- Prior art keywords
- shape
- molecules
- splicing
- spliced
- molecule
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/10—Analysis or design of chemical reactions, syntheses or processes
Landscapes
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Crystallography & Structural Chemistry (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computing Systems (AREA)
- Theoretical Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a chemical molecule synthesis simulation method and device. The method comprises: for each sampled input combination, judging whether a selected splicing molecule can be spliced with the synthetic shape formed from the initial shape and the guide molecules; if so, taking the spliced molecule as a new initial shape, selecting a splicing molecule again, and judging whether it can be spliced with the new synthetic shape formed from the new initial shape and the guide molecules; if not, selecting a splicing molecule again and repeating the judgment until n consecutively selected splicing molecules all fail to splice, and then determining whether the synthetic shape is a finite shape or an infinite shape; and, when the number of sampled input combinations reaches a sampling number threshold, calculating the ratio of the number of finite shapes to the sampling number threshold and taking it as the ratio of the number of finite shapes to the number of input combinations in the actual chemical reaction. The invention can reduce the time cost incurred by actual chemical reactions.
Description
Technical Field
The invention relates to the technical field of chemical molecule synthesis, in particular to a chemical molecule synthesis simulation method and device.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
In chemical molecule synthesis there are many molecules with similar structures, called base molecules, which can combine according to specific rules. Because weak connections between molecules also exist, a certain degree of irregular connection is tolerated. For example, when two shapes composed of such molecules are joined, most of the connection region may follow the rules while only a small part does not; because the overall connection between the large molecules is very tight, the small conflicting region is tolerated. There are also molecules that are small enough, relative to the base molecules, for their shape to be ignored. These are called guide molecules, and they can block or alter the binding relationships between base molecules. When some base molecules and guide molecules are placed in the same environment, one or more finite shapes or infinite shapes (shapes that can keep growing indefinitely, in a regular or irregular manner) may eventually be synthesized.
One objective of research on chemical molecule synthesis is to determine, over all combinations of the known base and guide molecule types (i.e., all subsets of the molecule types), the number of finite and infinite shapes that can ultimately be synthesized in the ideal case where every molecule is available in sufficient quantity and the reaction time is sufficiently long.
However, in actual chemical reactions, molecule synthesis is slow and often takes days or even months to reach the desired result. The larger the final shapes and the more molecules they contain, the longer the time required; and the larger a shape is, the lower the probability that it is ultimately observed. Researchers therefore often spend a great deal of time and effort observing the synthesis process, and experimental efficiency is low.
Disclosure of Invention
An embodiment of the invention provides a chemical molecule synthesis simulation method for quickly obtaining a good estimate of the number of shapes that different types of base molecules and guide molecules can finally synthesize, greatly reducing the time cost incurred by actual chemical reactions. The method comprises:
before each molecule synthesis, randomly sampling from the preset types of base molecules and guide molecules respectively, to obtain an input combination consisting of one or more base molecules and zero or more guide molecules;
for each input combination, randomly selecting a base molecule from the input combination as an initial shape, and performing the following procedure on the initial shape:
traversing the guide molecules in the input combination, determining the guide molecules that can be spliced at each splicing position of the initial shape, and splicing all spliceable guide molecules with the initial shape to obtain a synthetic shape;
randomly selecting a base molecule from the input combination as a splicing molecule;
judging whether the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape;
if so, randomly selecting one splicing position from the at least one splicing position and splicing the splicing molecule with the synthetic shape; taking the spliced molecule as a new initial shape and performing the procedure again;
if not, selecting a splicing molecule from the input combination again and judging whether it can be spliced with the current synthetic shape, until n consecutively selected splicing molecules all fail to splice with the current synthetic shape, and then determining whether the current synthetic shape is a finite shape or an infinite shape according to the relationship between the number of base molecules forming the current synthetic shape and the molecular weight threshold;
and, when the number of sampled input combinations reaches a sampling number threshold, calculating the ratio of the number of finite shapes obtained from all input combinations to the sampling number threshold, and taking this ratio as the ratio of the number of finite shapes formed in the actual chemical reaction to the number of input combinations.
An embodiment of the invention further provides a chemical molecule synthesis simulation device for quickly obtaining a good estimate of the number of shapes that different types of base molecules and guide molecules can finally synthesize, greatly reducing the time cost incurred by actual chemical reactions. The device comprises:
a random sampling module, configured to randomly sample from the preset types of base molecules and guide molecules respectively before each molecule synthesis, to obtain an input combination consisting of one or more base molecules and zero or more guide molecules;
a molecule selection module, configured to randomly select, for each input combination, a base molecule from the input combination as an initial shape, the following procedure being performed on the initial shape by a splicing module, the molecule selection module, and a judging module:
the splicing module, configured to traverse the guide molecules in the input combination, determine the guide molecules that can be spliced at each splicing position of the initial shape, and splice all spliceable guide molecules with the initial shape to obtain a synthetic shape;
the molecule selection module, further configured to randomly select a base molecule from the input combination as a splicing molecule;
the judging module, configured to judge whether the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape;
the splicing module, further configured to, when the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape, randomly select one splicing position from the at least one splicing position and splice the splicing molecule with the synthetic shape, and to take the spliced molecule as a new initial shape and trigger the splicing module, the molecule selection module, and the judging module to perform the procedure again;
the molecule selection module, further configured to, when the splicing molecule cannot be spliced with any of the splicing positions of the synthetic shape, select a splicing molecule from the input combination again and trigger the judging module to judge whether it can be spliced with the current synthetic shape, until n consecutively selected splicing molecules all fail to splice with the current synthetic shape, whereupon a determining module is triggered to determine whether the current synthetic shape is a finite shape or an infinite shape according to the relationship between the number of base molecules forming the current synthetic shape and the molecular weight threshold;
and the determining module, further configured to, when the number of sampled input combinations reaches the sampling number threshold, calculate the ratio of the number of finite shapes obtained from all input combinations to the sampling number threshold, and take this ratio as the ratio of the number of finite shapes formed in the actual chemical reaction to the number of input combinations.
An embodiment of the invention also provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the chemical molecule synthesis simulation method when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program for executing the chemical molecule synthesis simulation method is stored.
In the embodiments of the invention, the actual chemical reaction process is simulated by a computer, and the number of finite shapes formed in the actual chemical reaction is estimated from the number of finite shapes obtained by simulation. This greatly reduces the time cost incurred by actual chemical reactions and quickly yields a good estimate of the number of synthetic shapes finally formed, even when there are very many types of base molecules and guide molecules. In addition, the simulation yields the actual shapes of some of the finite and infinite shapes, which provides more useful information about the infinite shapes and prepares for deducing their characteristics in the future.
Drawings
To illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a flow chart of a chemical molecular synthesis simulation method according to an embodiment of the present invention;
FIG. 2 is a flow chart of another method for simulating chemical molecule synthesis according to an embodiment of the present invention;
FIG. 3 is a flow chart of another method for simulating chemical molecule synthesis according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a chemical molecule synthesis simulation apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
The technical terms in the present invention will be briefly explained below.
1) Base molecule: the smallest unit molecule used for splicing in a synthesis environment; base molecules have similar structures;
2) Guide molecule: a molecule in the synthesis environment that is small relative to the base molecules, so that its shape can be ignored during shape synthesis, and that guides the synthesis of base molecules (e.g., by hindering or altering the manner of synthesis);
3) Input combination: a set of molecule types consisting of certain base molecule types and guide molecule types;
4) Splicing position: a location on a base molecule at which it can be joined to another base molecule; for example, a rectangular molecule has four splicing positions.
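An embodiment of the invention provides a chemical molecule synthesis simulation method. As shown in FIG. 1, the method comprises steps 101 to 108:
Step 101, before each molecule synthesis, randomly sampling from the preset types of base molecules and guide molecules respectively, to obtain an input combination consisting of one or more base molecules and zero or more guide molecules.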
The preset types of base molecules and guide molecules are set by the user. For example, if the user sets three base molecules A, B, and C, the input combination obtained by random sampling may contain only A, only B, only C, AB, AC, BC, or ABC.
It should be noted that random sampling selects only the types of base molecules and guide molecules that make up the input combination; the quantity of each base molecule and each guide molecule in the input combination is considered unlimited. Random sampling may also produce the same input combination more than once.
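The sampling step can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `sample_input_combination` is hypothetical, and the sampling distribution (a uniformly chosen subset size, then a uniformly chosen subset of that size) is an assumption, since the text does not specify one.

```python
import random

def sample_input_combination(base_types, guide_types, rng=random):
    """Sample which molecule *types* take part; quantities are unlimited."""
    # One or more base types: pick a subset size, then a subset (assumed scheme).
    k = rng.randint(1, len(base_types))
    bases = frozenset(rng.sample(sorted(base_types), k))
    # Zero or more guide types.
    g = rng.randint(0, len(guide_types))
    guides = frozenset(rng.sample(sorted(guide_types), g))
    return bases, guides
```

With base types A, B, and C, repeated calls can return any of the seven non-empty subsets listed above, and the same combination may be drawn more than once.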
Step 102, for each input combination, randomly selecting a base molecule from the input combination as an initial shape, and performing the following procedure on the initial shape.
Step 103, traversing the guide molecules in the input combination, determining the guide molecules that can be spliced at each splicing position of the initial shape, and splicing all spliceable guide molecules with the initial shape to obtain a synthetic shape.
The splicing mechanism of guide molecules is as follows: whenever a splicing position on a molecule (a base molecule or a synthesized molecule) can be spliced with a guide molecule, the guide molecule is spliced there, and this applies to every such position. That is, if a molecule has 3 positions that can be spliced with guide molecules, all 3 are spliced with guide molecules.
When a guide molecule is spliced with a molecule, it may occupy a spliceable position without generating a new one, or it may occupy the position and convert it into a new spliceable position.
It should be noted that there may be a case where none of the guide molecules can be spliced to the initial shape; in this case, the initial shape is used directly as the synthetic shape.
In another implementation, if the input combination does not contain a guide molecule, the initial shape is directly treated as the synthetic shape in step 103.
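The guide molecule mechanism can be sketched as below. The data model is an assumption made for illustration: each open splicing position carries a site label, and a hypothetical `Guide` binds one label and either blocks the position or converts it to a new label. The sketch makes a single pass, so converted positions are not offered to guide molecules again.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Guide:
    name: str
    binds: str                         # site label this guide can attach to
    converts_to: Optional[str] = None  # new label, or None if the site is blocked

def apply_guides(open_sites, guides):
    """Return the open sites of the synthetic shape after guide splicing.

    open_sites maps a position id to its site label (hypothetical model).
    """
    result = {}
    for pos, label in open_sites.items():
        guide = next((g for g in guides if g.binds == label), None)
        if guide is None:
            result[pos] = label              # no guide fits: site stays open
        elif guide.converts_to is not None:
            result[pos] = guide.converts_to  # occupied and converted
        # otherwise the site is occupied and no new position is generated
    return result
```

When `guides` is empty, or no guide fits any position, the function returns the positions unchanged, which corresponds to using the initial shape directly as the synthetic shape.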
Step 104, randomly selecting a base molecule from the input combination as a splicing molecule.
Because the quantity of each base molecule and guide molecule in an input combination is unlimited, the base molecule selected in steps 102 and 104, and in any subsequent random selection, may be the same as one selected previously.
Step 105, judging whether the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape.
As more molecules are spliced onto the synthetic shape, the number of splicing positions on it may keep growing. To keep track of them, each time an initial shape or synthetic shape is obtained, the information on all of its splicing positions can be updated in a splicing position list. For example, after a base molecule is selected as the initial shape in step 102, all splicing positions of the initial shape are written to the splicing position list. Taking the case where all base molecules are rectangular, the list then contains the four splicing positions of the base molecule: top, bottom, left, and right. When the initial shape is spliced with one guide molecule to obtain a synthetic shape, and the guide molecule occupies one splicing position without generating a new one, the occupied position is deleted from the list. When a splicing molecule is then spliced successfully with the synthetic shape, one splicing position on each of them is occupied: 2 splicing positions remain on the original synthetic shape and 3 remain on the splicing molecule, so the occupied position of the original synthetic shape is deleted from the list and the 3 remaining positions of the splicing molecule are added.
Updating the splicing position list each time an initial shape or synthetic shape is obtained ensures that the latest splicing position information can be read directly from the list, so that matching splicing positions can be found quickly in the next splicing step.
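The rectangular-molecule example above can be traced with a short sketch. The function names and the `(molecule id, side)` representation are hypothetical, and the sketch deliberately ignores geometry (for instance, a new molecule that happens to touch a second neighbour).

```python
SIDES = ("top", "bottom", "left", "right")

def initial_positions(mol_id):
    """All four sides of a rectangular base molecule are open at first."""
    return [(mol_id, side) for side in SIDES]

def block_position(positions, pos):
    """A guide molecule occupies `pos` without creating a new position."""
    return [p for p in positions if p != pos]

def splice(positions, shape_pos, new_mol_id, new_mol_side):
    """Join a new molecule at `shape_pos`; both touching sides are consumed."""
    remaining = [p for p in positions if p != shape_pos]
    added = [(new_mol_id, s) for s in SIDES if s != new_mol_side]
    return remaining + added
```

Starting from `initial_positions(0)` (4 entries), blocking one side leaves 3, and splicing molecule 1 onto one of those leaves 2 entries for molecule 0 plus 3 for molecule 1, matching the counts in the text.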
Specifically, step 105 may be performed as follows: first obtain the splicing position information of the splicing molecule; match it one by one against the information on all splicing positions of the synthetic shape in the splicing position list; and judge from the success or failure of the matching whether the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape.
Successful matching indicates that the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape; failed matching indicates that it cannot be spliced with any of them.
It should be noted that the splicing position information of all base molecules is known and stored in the storage module of the device. Each base molecule has a number, and its splicing position information can be looked up in the storage module by that number. The splicing position information includes the location of each splicing position on the base molecule (e.g., the top, bottom, left, and right of a rectangular molecule) and the requirement that each splicing position places on a splicing molecule; a splicing molecule can be spliced with the base molecule at that position when it satisfies the requirement.
Based on the above, matching the splicing position information of the splicing molecule one by one against all splicing positions of the synthetic shape in the splicing position list amounts to judging whether a splicing position of the splicing molecule and a splicing position of the synthetic shape satisfy each other's requirements. If they do, the matching succeeds, and the successfully matched splicing positions of the synthetic shape can be recorded in a selectable position list.
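The matching in step 105 can be illustrated with the sketch below. Everything about the data model is an assumption: the text does not say how a "requirement" is encoded, so each side is given a hypothetical `offers` label and `accepts` label, and `SITE_STORE` stands in for the storage module indexed by base molecule number.

```python
from typing import NamedTuple

class Site(NamedTuple):
    offers: str    # what this side presents to a partner
    accepts: str   # what this side requires of a partner

# Hypothetical store: base molecule number -> side name -> Site
SITE_STORE = {
    1: {"top": Site("a", "b"), "bottom": Site("b", "a")},
    2: {"top": Site("c", "c"), "bottom": Site("a", "b")},
}

def compatible(s, t):
    """Both sides must satisfy each other's requirement."""
    return s.offers == t.accepts and t.offers == s.accepts

def selectable_positions(shape_sites, mol_number):
    """Return (shape position, molecule side) pairs that match."""
    mol_sites = SITE_STORE[mol_number]
    return [
        (pos, side)
        for pos, site in shape_sites.items()
        for side, mol_site in mol_sites.items()
        if compatible(site, mol_site)
    ]
```

An empty result corresponds to a failed match. A non-empty result plays the role of the selectable position list, from which one entry could then be drawn at random, for example with `random.choice`.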
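Step 106, if so, randomly selecting one splicing position from the at least one splicing position, and splicing the splicing molecule with the synthetic shape; taking the spliced molecule as a new initial shape and performing the above procedure again.
That is, if 3 splicing positions on the synthetic shape can be spliced with the splicing molecule, one of the 3 is selected at random for splicing to obtain the spliced molecule, i.e., a new synthetic shape.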
Step 107, if not, selecting a splicing molecule from the input combination again and judging whether it can be spliced with the current synthetic shape, until n consecutively selected splicing molecules all fail to splice with the current synthetic shape, and then determining whether the current synthetic shape is a finite shape or an infinite shape according to the relationship between the number of base molecules forming the current synthetic shape and the molecular weight threshold.
Here n is a threshold set by the user. Taking n = 100 as an example, if none of 100 consecutively selected splicing molecules can be spliced with the current synthetic shape, it is considered that no base molecule in the input combination can be spliced with the current synthetic shape, and the synthesis simulation process is terminated.
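The stopping rule based on n consecutive failures can be sketched as a loop. The names `grow`, `try_splice`, and `max_molecules` are hypothetical. `try_splice(shape, mol)` stands in for steps 103, 105, and 106 and returns the new shape or `None`. The `max_molecules` cap is an assumption added here so that a shape that never stops growing still terminates; the text itself only describes stopping after n consecutive failures.

```python
import random

def grow(initial_shape, bases, try_splice, n, max_molecules, rng=random):
    """Grow one shape; return (shape, number of base molecules in it)."""
    shape, count, failures = initial_shape, 1, 0
    while failures < n and count < max_molecules:
        mol = rng.choice(bases)            # the same type may be drawn again
        new_shape = try_splice(shape, mol)
        if new_shape is None:
            failures += 1                  # one more consecutive failure
        else:
            shape, count, failures = new_shape, count + 1, 0
    return shape, count
```

One natural choice, still an assumption, is to set `max_molecules` to the molecular weight threshold, so that a shape reaching the cap is exactly one that would be classified as infinite.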
Referring to FIG. 2, in step 107, determining whether the current synthetic shape is a finite shape or an infinite shape according to the relationship between the number of base molecules forming the current synthetic shape and the molecular weight threshold may be performed as the following steps 201 to 203:
Step 201, judging whether the number of base molecules forming the current synthetic shape is greater than or equal to the molecular weight threshold.
The molecular weight threshold is the limit on the number of base molecules that separates finite shapes from infinite shapes.
Step 202, if the number of base molecules forming the current synthetic shape is greater than or equal to the molecular weight threshold, determining the current synthetic shape to be an infinite shape.
Step 203, if the number of base molecules forming the current synthetic shape is less than the molecular weight threshold, determining the current synthetic shape to be a finite shape.
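The threshold rule reduces to a single comparison, sketched below with a hypothetical function name. The only point worth noting is the boundary: a shape whose base molecule count equals the threshold is treated as infinite.

```python
def classify_shape(num_base_molecules: int, threshold: int) -> str:
    """At or above the threshold counts as infinite, below it as finite."""
    if num_base_molecules >= threshold:
        return "infinite"
    return "finite"
```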
In the embodiment of the invention, after the current synthetic shape has been determined to be a finite shape or an infinite shape according to the relationship between the number of base molecules forming it and the molecular weight threshold, the finite shape obtained from the current input combination is compared with the finite shapes already obtained in the current chemical molecule synthesis simulation. If it differs from every one of them, the finite shape is stored together with the type information of the base molecules and guide molecules that compose it.
Step 108, when the number of sampled input combinations reaches a sampling number threshold, calculating the ratio of the number of finite shapes obtained from all input combinations to the sampling number threshold, and taking this ratio as the ratio of the number of finite shapes formed in the actual chemical reaction to the number of input combinations.
Building on the storage of non-repeated finite shapes in step 107, in this step, when the number of sampled input combinations reaches the sampling number threshold, the ratio of the number of finite shapes obtained from all input combinations to the sampling number threshold is calculated from the number of finite shapes that have been stored.
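The deduplicated count and the final ratio can be sketched as follows. The function name and the input format are hypothetical: each sampled input combination is assumed to yield a `(kind, shape_key)` pair, where `shape_key` is some hashable canonical form of the shape. How two shapes are compared for equality is not specified in the text, so that canonical form is an assumption.

```python
def estimate_finite_ratio(results, sample_threshold):
    """Ratio of distinct finite shapes to the number of sampled combinations.

    results: iterable of (kind, shape_key), one per sampled input combination.
    """
    stored = set()                 # non-repeated finite shapes only
    for kind, shape_key in results:
        if kind == "finite":
            stored.add(shape_key)  # duplicates are ignored by the set
    return len(stored) / sample_threshold
```

Because only distinct finite shapes are kept, the storage grows with the number of different shapes found rather than with the number of samples.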
Furthermore, after step 107 is performed, the number of finite shapes formed in the actual chemical reaction can also be estimated using the following methods:
(i) directly estimating the number of finite shapes in the actual chemical reaction by the number of finite shapes obtained after sampling;
(ii) estimating the average number of finite shapes in the actual chemical reaction by the average of the number of finite shapes obtained after sampling over the number of samples.
Method (i) always gives a lower-bound estimate of the real situation; according to the law of large numbers, every finite shape is obtained with probability greater than 0 under unlimited sampling. Method (ii) is the conventional estimation approach, in which the sample average is used to estimate the true average. For this problem, however, because repeated finite shapes are not counted, a sample size that is too large or too small makes the estimate inaccurate, parameter tuning becomes overly complex, and the actual result is inaccurate; and if repeated shapes were stored and counted, more storage space might be required.
The method in the embodiment of the invention ensures that a good estimate of the real number of finite shapes is obtained regardless of whether the number of samples exceeds the number of real input combinations or the number of base molecules in the current combination, while keeping the required storage space small.
For ease of understanding, fig. 3 shows another flow chart of the molecular synthesis simulation method in the embodiment of the present invention, and the implementation of the above steps can also refer to the flow chart shown in fig. 3.
In the embodiments of the invention, the actual chemical reaction process is simulated by a computer, and the number of finite shapes formed in the actual chemical reaction is estimated from the number of finite shapes obtained by simulation. This greatly reduces the time cost incurred by actual chemical reactions and quickly yields a good estimate of the number of synthetic shapes finally formed, even when there are very many types of base molecules and guide molecules. In addition, the simulation yields the actual shapes of some of the finite and infinite shapes, which provides more useful information about the infinite shapes and prepares for deducing their characteristics in the future.
The embodiment of the invention also provides a chemical molecule synthesis simulation device, which is described in the following embodiment. Because the principle of solving the problems of the device is similar to that of the chemical molecule synthesis simulation method, the implementation of the device can refer to the implementation of the chemical molecule synthesis simulation method, and repeated parts are not described again.
As shown in fig. 4, the apparatus 400 includes a random sampling module 401, a molecular selection module 402, a judgment module 403, a concatenation module 404, and a determination module 405.
The random sampling module 401 is configured to perform random sampling on a preset type of base molecule and guide molecule respectively before synthesizing molecules each time, so as to obtain an input combination consisting of one or more base molecules and zero or more guide molecules;
a molecule selection module 402, configured to, for each input combination, randomly select a basic molecule from the input combination as an initial shape, the following method being performed for the initial shape by the splicing module 404, the molecule selection module 402 and the judging module 403:
a splicing module 404, configured to traverse the guide molecules in the input combination, determine the guide molecules that can be spliced at each splicing position of the initial shape, and splice all such guide molecules with the initial shape to obtain a synthetic shape;
the molecule selection module 402 is further configured to randomly select a basic molecule from the input combination as a splicing molecule;
a judging module 403, configured to judge whether the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape;
the splicing module 404 is further configured to, when the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape, randomly select one splicing position from the at least one splicing position and splice the splicing molecule with the synthetic shape; the shape obtained by splicing is taken as a new initial shape, and the splicing module 404, the molecule selection module 402 and the judging module 403 are triggered to execute the method again;
the molecule selection module 402 is further configured to, when the splicing molecule cannot be spliced with any of the splicing positions of the synthetic shape, select a splicing molecule from the input combination again and trigger the judging module 403 to judge whether that splicing molecule can be spliced with the current synthetic shape, repeating this until none of n consecutively selected splicing molecules can be spliced with the current synthetic shape, whereupon the determining module 405 is triggered to determine whether the current synthetic shape is a finite shape or an infinite shape according to the size relationship between the number of basic molecules constituting the current synthetic shape and a molecular weight threshold;
the determining module 405 is further configured to, when the number of sampled input combinations reaches the sampling number threshold, calculate the ratio of the number of finite shapes obtained from all the input combinations to the sampling number threshold, and determine this ratio as the ratio of the number of finite shapes formed in the actual chemical reaction to the number of input combinations.
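The cooperation of the modules above can be illustrated with a minimal Python sketch. It is not the patented implementation: the representation of a molecule as a tuple of port labels, the pairing rule in `complement`, the treatment of guide molecules as caps on open positions, and the early exit once the molecule count reaches the threshold (added only so that the loop always terminates) are all assumptions made for illustration.

```python
import random

def complement(label):
    # Assumed pairing rule: port "a" splices with position "A" and vice versa.
    return label.swapcase()

def cap_with_guides(open_positions, guide_ports):
    # Assumed guide behaviour: a guide molecule occupies every open position it matches.
    return [p for p in open_positions if complement(p) not in guide_ports]

def grow_shape(basic, guide_ports, n, threshold, rng):
    """Grow one shape from an input combination; return (molecule count, class)."""
    first = rng.choice(basic)                      # initial shape
    open_positions = cap_with_guides(list(first), guide_ports)
    count = 1                                      # basic molecules in the shape
    failures = 0                                   # consecutive failed selections
    while failures < n and count < threshold:      # threshold exit is an assumption
        molecule = rng.choice(basic)               # candidate splicing molecule
        matches = [i for i, p in enumerate(open_positions)
                   if complement(p) in molecule]
        if not matches:
            failures += 1
            continue
        position = open_positions.pop(rng.choice(matches))
        remaining = list(molecule)
        remaining.remove(complement(position))     # the port used for splicing
        open_positions = cap_with_guides(open_positions + remaining, guide_ports)
        count += 1
        failures = 0
    return count, ("infinite" if count >= threshold else "finite")
```

In this toy model, a single molecule type `("a", "A")` can always extend the chain and is therefore classified as infinite, while the same type with guide ports `{"a", "A"}` is capped immediately and stays finite.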
In an implementation manner of the embodiment of the present invention, the apparatus 400 further includes a list updating module 406, configured to: each time the initial shape or the synthetic shape is obtained, update the information of all the splicing positions of the initial shape or the synthetic shape in a splicing position list;
the judging module 403 is configured to:
acquire the splicing position information of the splicing molecule;
match the splicing position information of the splicing molecule one by one against the information of all splicing positions of the synthetic shape in the splicing position list;
and judge, according to whether the matching succeeds, whether the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape.
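As a hedged illustration of the one-by-one matching described above, the following Python sketch checks a splicing molecule against a splicing position list. The string labels and the `COMPATIBLE` table are hypothetical; the source does not specify how splicing position information is encoded or compared.

```python
# Hypothetical compatibility table: (molecule port, shape position) pairs that can splice.
COMPATIBLE = {("a", "A"), ("b", "B")}

def matching_positions(molecule_ports, position_list):
    """Return the indices of positions in the list that the molecule can splice with."""
    matches = []
    for index, position in enumerate(position_list):   # one-by-one matching
        for port in molecule_ports:
            if (port, position) in COMPATIBLE:
                matches.append(index)
                break                                   # one matching port is enough
    return matches

def can_splice(molecule_ports, position_list):
    """True when at least one splicing position matches."""
    return bool(matching_positions(molecule_ports, position_list))
```

Returning the indices rather than only a boolean lets a caller pick one of the matching positions at random afterwards, as the splicing step requires.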
In an implementation manner of the embodiment of the present invention, the determining module 405 is configured to:
judging whether the number of basic molecules forming the current synthetic shape is greater than or equal to a molecular weight threshold value or not;
when the number of basic molecules forming the current synthetic shape is larger than or equal to the molecular weight threshold value, determining the current synthetic shape as an infinite shape;
when the number of basic molecules forming the current synthetic shape is less than the molecular weight threshold, determining the current synthetic shape as a finite shape.
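The threshold comparison above amounts to a single inequality. A minimal sketch follows; the function and parameter names are illustrative, not taken from the source.

```python
def classify_shape(num_basic_molecules, threshold):
    """Classify a shape as 'infinite' when its basic-molecule count reaches the threshold."""
    if num_basic_molecules >= threshold:
        return "infinite"
    return "finite"
```

Note that a count exactly equal to the threshold is classified as infinite, matching the "greater than or equal to" wording.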
In one implementation manner of the embodiment of the present invention, the apparatus 400 further includes a shape recording module 407 configured to:
comparing the finite shape obtained from the current input combination with the finite shapes already obtained in the chemical molecule synthesis simulation;
if the finite shape obtained from the current input combination differs from every finite shape already obtained, storing that finite shape together with the type information of the basic molecules and the guide molecules that compose it.
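One way to sketch this record-keeping in Python is shown below. It assumes each finite shape can be reduced to a hashable canonical key (`shape_key`), which the source does not specify; the class and method names are likewise hypothetical.

```python
class ShapeRecorder:
    """Stores each distinct finite shape once, with the molecule types that compose it."""

    def __init__(self):
        # shape_key -> (basic molecule types, guide molecule types)
        self.records = {}

    def record(self, shape_key, basic_types, guide_types):
        """Store the shape if it is new; return True when it was stored."""
        if shape_key in self.records:       # same as a finite shape already obtained
            return False
        self.records[shape_key] = (frozenset(basic_types), frozenset(guide_types))
        return True
```

How two shapes are judged to be the same (for example, up to rotation or relabelling) is left entirely to the choice of `shape_key` in this sketch.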
In another implementation manner of the embodiment of the present invention, the determining module 405 is configured to:
when the number of sampled input combinations reaches the sampling number threshold, calculating, according to the stored number of finite shapes obtained from the input combinations, the ratio of the number of finite shapes obtained from all the input combinations to the sampling number threshold.
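The ratio described above can be sketched as a simple sampling loop. In this hedged Python example, `sample_input_combination`, `estimate_finite_ratio` and the `simulate` callback are hypothetical names, the uniform choice of how many types to draw is an assumption, and counting one outcome per sampled input combination is one possible reading of the text.

```python
import random

def sample_input_combination(basic_types, guide_types, rng):
    # Assumed sampling scheme: one or more basic types, zero or more guide types.
    k = rng.randint(1, len(basic_types))
    g = rng.randint(0, len(guide_types))
    return rng.sample(basic_types, k), rng.sample(guide_types, g)

def estimate_finite_ratio(basic_types, guide_types, simulate, sample_threshold, rng):
    """Return (number of finite outcomes) / (sampling number threshold)."""
    finite = 0
    for _ in range(sample_threshold):
        basic, guides = sample_input_combination(basic_types, guide_types, rng)
        # simulate is a caller-supplied function returning "finite" or "infinite".
        if simulate(basic, guides, rng) == "finite":
            finite += 1
    return finite / sample_threshold
```

A caller would pass the shape-growing simulation as `simulate`; the returned value is then used as the estimate of the finite-shape ratio in the actual chemical reaction.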
In the embodiment of the invention, the actual chemical reaction process is simulated by a computer, and the number of finite shapes formed in the actual chemical reaction is estimated from the number of finite shapes obtained by simulation. This greatly reduces the time cost of carrying out the actual chemical reaction and quickly yields a good estimate of the number of synthetic shapes finally formed, even when the number of types of basic molecules and guide molecules is huge. In addition, the simulation yields some of the finite shapes and infinite shapes themselves, which provides more useful information about infinite shapes and prepares for deducing their characteristics in the future.
An embodiment of the present invention further provides a computer device. Fig. 5 is a schematic diagram of the computer device in an embodiment of the present invention. The computer device is capable of implementing all steps of the chemical molecule synthesis simulation method in the above embodiment, and specifically includes:
a processor 501, a memory 502, a communication interface 503, and a communication bus 504;
the processor 501, the memory 502 and the communication interface 503 communicate with one another through the communication bus 504; the communication interface 503 is used for information transmission between related devices;
the processor 501 is used to call the computer program in the memory 502, and when the processor executes the computer program, the chemical molecule synthesis simulation method in the above embodiment is implemented.
An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program for executing the chemical molecule synthesis simulation method is stored.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are provided to further explain the objects, technical solutions and advantages of the present invention in detail, and it should be understood that the above-mentioned embodiments are only examples of the present invention and should not be used to limit the scope of the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (10)
1. A method for chemical molecular synthesis simulation, the method comprising:
before each synthesis of molecules, randomly sampling from preset types of basic molecules and guide molecules respectively to obtain an input combination consisting of one or more basic molecules and zero or more guide molecules;
for each input combination, randomly selecting a basic molecule from the input combination as an initial shape, and performing the following method for the initial shape:
traversing the guide molecules in the input combination, determining the guide molecules that can be spliced at each splicing position of the initial shape, and splicing all such guide molecules with the initial shape to obtain a synthetic shape;
randomly selecting a basic molecule from the input combination as a splicing molecule;
judging whether the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape;
if so, randomly selecting one splicing position from the at least one splicing position, and splicing the splicing molecule with the synthetic shape; taking the shape obtained by splicing as a new initial shape, and executing the method again;
if not, selecting a splicing molecule from the input combination again and judging whether it can be spliced with the current synthetic shape, until none of n consecutively selected splicing molecules can be spliced with the current synthetic shape, and then determining whether the current synthetic shape is a finite shape or an infinite shape according to the size relationship between the number of basic molecules constituting the current synthetic shape and a molecular weight threshold;
and when the number of sampled input combinations reaches a sampling number threshold, calculating the ratio of the number of finite shapes obtained from all the input combinations to the sampling number threshold, and determining this ratio as the ratio of the number of finite shapes formed in the actual chemical reaction to the number of input combinations.
2. The method of claim 1, wherein
each time the initial shape or the synthetic shape is obtained, the method further comprises: updating the information of all splicing positions of the initial shape or the synthetic shape in a splicing position list;
judging whether the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape comprises:
acquiring the splicing position information of the splicing molecule;
matching the splicing position information of the splicing molecule one by one against the information of all splicing positions of the synthetic shape in the splicing position list;
and judging, according to whether the matching succeeds, whether the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape.
3. The method of claim 1, wherein determining the current synthetic shape as a finite shape or an infinite shape according to a size relationship between a number of base molecules constituting the current synthetic shape and a molecular weight threshold comprises:
judging whether the number of basic molecules forming the current synthetic shape is greater than or equal to a molecular weight threshold value or not;
if the number of basic molecules forming the current synthetic shape is larger than or equal to the molecular weight threshold, determining the current synthetic shape as an infinite shape;
if the number of basic molecules forming the current synthetic shape is less than the molecular weight threshold, determining the current synthetic shape as a finite shape.
4. The method of claim 3, wherein after determining the current synthetic shape to be a finite shape or an infinite shape based on a size relationship between a number of base molecules constituting the current synthetic shape and a molecular weight threshold, the method further comprises:
comparing the finite shape obtained from the current input combination with the finite shapes already obtained in the chemical molecule synthesis simulation;
if the finite shape obtained by the current input combination is different from any one of the finite shapes obtained, storing the finite shape obtained by the current input combination and the type information of the basic molecules and the guide molecules which compose the finite shape.
5. The method of claim 4, wherein calculating a ratio of the number of finite shapes resulting from all input combinations to the sample number threshold when the number of sampled input combinations reaches the sample number threshold comprises:
when the number of sampled input combinations reaches the sampling number threshold, calculating, according to the stored number of finite shapes obtained from the input combinations, the ratio of the number of finite shapes obtained from all the input combinations to the sampling number threshold.
6. A chemical molecule synthesis simulation apparatus, comprising:
a random sampling module, configured to, before each synthesis of molecules, randomly sample from preset types of basic molecules and guide molecules respectively to obtain an input combination consisting of one or more basic molecules and zero or more guide molecules;
a molecule selection module, configured to, for each input combination, randomly select a basic molecule from the input combination as an initial shape, the following method being performed for the initial shape by the splicing module, the molecule selection module and the judging module:
a splicing module, configured to traverse the guide molecules in the input combination, determine the guide molecules that can be spliced at each splicing position of the initial shape, and splice all such guide molecules with the initial shape to obtain a synthetic shape;
the molecule selection module is further configured to randomly select a basic molecule from the input combination as a splicing molecule;
a judging module, configured to judge whether the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape;
the splicing module is further configured to, when the splicing molecule can be spliced with at least one of the splicing positions of the synthetic shape, randomly select one splicing position from the at least one splicing position and splice the splicing molecule with the synthetic shape; the shape obtained by splicing is taken as a new initial shape, and the splicing module, the molecule selection module and the judging module are triggered to execute the method again;
the molecule selection module is further configured to, when the splicing molecule cannot be spliced with any of the splicing positions of the synthetic shape, select a splicing molecule from the input combination again and trigger the judging module to judge whether that splicing molecule can be spliced with the current synthetic shape, repeating this until none of n consecutively selected splicing molecules can be spliced with the current synthetic shape, whereupon a determining module is triggered to determine whether the current synthetic shape is a finite shape or an infinite shape according to the size relationship between the number of basic molecules constituting the current synthetic shape and a molecular weight threshold;
and the determining module is further configured to, when the number of sampled input combinations reaches a sampling number threshold, calculate the ratio of the number of finite shapes obtained from all the input combinations to the sampling number threshold, and determine this ratio as the ratio of the number of finite shapes formed in the actual chemical reaction to the number of input combinations.
7. The apparatus of claim 6, wherein the determining module is configured to perform:
judging whether the number of basic molecules forming the current synthetic shape is greater than or equal to a molecular weight threshold value or not;
when the number of basic molecules forming the current synthetic shape is larger than or equal to the molecular weight threshold value, determining the current synthetic shape as an infinite shape;
when the number of basic molecules forming the current synthetic shape is less than the molecular weight threshold, determining the current synthetic shape as a finite shape.
8. The apparatus of claim 7, further comprising a shape recording module to:
comparing the finite shape obtained by the current input combination with the finite shape obtained in the chemical molecule synthesis simulation;
if the finite shape obtained by the current input combination is different from any one of the finite shapes obtained, storing the finite shape obtained by the current input combination and the type information of the basic molecules and the guide molecules which compose the finite shape.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the method of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110448408.5A CN113140262B (en) | 2021-04-25 | 2021-04-25 | Chemical molecule synthesis simulation method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110448408.5A CN113140262B (en) | 2021-04-25 | 2021-04-25 | Chemical molecule synthesis simulation method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113140262A (en) | 2021-07-20 |
CN113140262B (en) | 2022-05-03 |
Family
ID=76811968
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110448408.5A Active CN113140262B (en) | 2021-04-25 | 2021-04-25 | Chemical molecule synthesis simulation method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113140262B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110390997A (en) * | 2019-07-17 | 2019-10-29 | 成都火石创造科技有限公司 | A kind of chemical molecular formula joining method |
CN111816265A (en) * | 2020-06-30 | 2020-10-23 | 北京晶派科技有限公司 | Molecule generation method and computing device |
CN111899807A (en) * | 2020-06-12 | 2020-11-06 | 中国石油天然气股份有限公司 | Molecular structure generation method, system, equipment and storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10332619B2 (en) * | 2014-10-13 | 2019-06-25 | Samsung Electronics Co., Ltd. | Methods and apparatus for in silico prediction of chemical reactions |
Non-Patent Citations (3)
Title |
---|
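A synthesis flow for digital signal processing with biomolecular reactions; Hua Jiang et al.; 2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD); 20101230; full text * |
Design and Implementation of a Compound Retrosynthesis System Based on Deep Learning (基于深度学习的化合物逆合成系统设计与实现); Guo Shihao (郭世豪); China Excellent Doctoral and Master's Dissertations Full-text Database (Master's), Engineering Science and Technology I (《中国优秀博硕士学位论文全文数据库(硕士) 工程科技Ⅰ辑》); 20200815; full text * |
Application of Computer Molecular Simulation Technology and Artificial Intelligence in Drug Research and Development (计算机分子模拟技术及人工智能在药物研发中的应用); Liu Jingtao (刘景陶) et al.; Technology Innovation and Application (《科技创新与应用》); 20180118 (No. 02); full text * |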
Also Published As
Publication number | Publication date |
---|---|
CN113140262A (en) | 2021-07-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109118353B (en) | Data processing method and device of wind control model | |
US20120330598A1 (en) | Method, program, and system for generating test cases | |
CN108984171B (en) | Continuous scene construction method based on Unity3D and storage medium | |
CN112905451A (en) | Automatic testing method and device for application program | |
CN110531977B (en) | Automatic control method and device for instrument, computer equipment and storage medium | |
CN110990001A (en) | IVR (Interactive Voice response) process execution method and device | |
US5878407A (en) | Storage of a graph | |
CN115220899A (en) | Model training task scheduling method and device and electronic equipment | |
CN116431520A (en) | Test scene determination method, device, electronic equipment and storage medium | |
CN113140262B (en) | Chemical molecule synthesis simulation method and device | |
CN106294530B (en) | The method and system of rule match | |
CN113140261B (en) | Chemical molecule synthesis simulation method and device | |
CN104809067A (en) | Equality constraint-oriented test case generation method and device | |
CN114840856B (en) | State-aware Internet of things trusted execution environment fuzzy test method and system | |
CN109271413A (en) | A kind of method, apparatus and computer storage medium of data query | |
CN106126056B (en) | PowerPoint-based slide automatic creation method and device | |
CN114594960A (en) | Recursive function analysis execution method, device and storage medium | |
US7305373B1 (en) | Incremental reduced error pruning | |
CN115114136A (en) | Test data generation method and device, electronic equipment and program product | |
Štefanič et al. | A flexible denormalization technique for data analysis above a deeply-structured relational database: biomedical applications | |
CN111949505B (en) | Test method, device and equipment | |
CN110021342A (en) | For accelerating the method and system of the identification of variant sites | |
CN108958654B (en) | Management method and related device of storage system | |
US5996053A (en) | Method and apparatus for fetching classified and stored information | |
CN116631505A (en) | SNP-SNP interaction detection method, system, equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||