CN114550841A - Prediction method of chemical reaction product - Google Patents


Info

Publication number
CN114550841A
Authority
CN
China
Prior art keywords
reaction
chemical
products
product
chemical reaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011302735.1A
Other languages
Chinese (zh)
Inventor
夏宁
孟超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Zhihua Technology Co ltd
Original Assignee
Wuhan Zhihua Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Zhihua Technology Co ltd
Priority to CN202011302735.1A
Publication of CN114550841A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/10 Analysis or design of chemical reactions, syntheses or processes
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50 Molecular design, e.g. of drugs
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Organic Low-Molecular-Weight Compounds And Preparation Thereof (AREA)

Abstract

The invention provides a method for predicting chemical reaction products, comprising the following steps: a) establishing a database of chemical reaction literature; calculating, from the reactions recorded in the literature, all reactions that may occur in a single step between the initial molecules; and judging the rationality and the accuracy of the chemical structures of the reaction products to obtain screened products; b) performing multiple rounds of calculation on the screened products obtained in step a) together with all initial compounds to obtain every reaction product that may form in the system at each single-step reaction, and finally screening the product structures at each molecular weight to find the most probable structure. The method achieves forward prediction of reaction products from the reaction raw materials, calculates all products of a single chemical reaction quickly and fully automatically, judges product structures accurately by molecular weight, and gives the synthesis paths of all products while predicting them.

Description

Prediction method of chemical reaction product
Technical Field
The invention relates to the technical field of data processing, in particular to a prediction method of a chemical reaction product.
Background
In pharmaceuticals and related chemical fields, drug and product development requires the synthesis of many organic molecules. Along a process route, the main product of every reaction step must be analyzed, and the by-products often have to be studied, analyzed and controlled as well. This work is difficult and depends mainly on the experience and knowledge of chemists. With the development of information technology, ever-growing databases of organic synthesis knowledge have been built. A chemist can query such a database for further reactions a compound may undergo, predict the outcome by combining the query results with personal chemical knowledge and experience, and verify the prediction against LCMS or NMR data. At present, however, this process is carried out manually: it demands a very thorough command of synthetic chemistry and takes even top chemists a great deal of time. A method that predicts all products of a chemical reaction using big data and cheminformatics therefore has significant practical value.
Current organic synthesis route prediction software on the market, such as chemplaner, works by extracting the part that changes in each reaction (called the reaction center) from a large body of reaction data, matching the target molecule against these reaction centers, and working backward step by step until available chemical raw materials are reached, which yields a complete synthesis route for the target molecule. Computer-aided synthesis in the prior art thus infers reaction routes by retrosynthesis, which starts from a small number of main products. Deriving all possible reactions in the forward direction significantly increases the computational difficulty and complexity, so existing tools cannot find all products of a single chemical reaction.
Disclosure of Invention
In view of the above, the present invention provides a method for predicting a chemical reaction product, which can achieve forward prediction of a reaction product through reaction raw materials, help chemists predict all products in a single chemical reaction, and further guide reaction practice.
The invention provides a prediction method of a chemical reaction product, which comprises the following steps:
a) establishing a database of chemical reaction literature; calculating, from the reactions recorded in the literature, all reactions that may occur in a single step between the initial molecules; and judging the rationality and the accuracy of the chemical structures of the reaction products to obtain screened products;
b) performing multiple rounds of calculation on the screened products obtained in step a) together with all initial compounds to obtain every reaction product that may form in the system at each single-step reaction, and finally screening the product structures at each molecular weight to find the most probable structure.
Preferably, the process of establishing the database containing the chemical reaction literature in the step a) is specifically as follows:
a1) storing the collected various chemical reactions in advance through a computer, converting the chemical reactions into a computer storage format, and then performing data processing to obtain one-to-one correspondence information of atoms in reactants and products;
a2) according to the corresponding information, identifying the reaction sites, and extracting the atoms, chemical bonds and groups which are directly connected with or conjugated with the reaction sites and the groups which are indirectly connected with the reaction sites and influence the reaction as the information for identifying the chemical reaction, and further storing the information in a database as the reaction rule.
Preferably, the various chemical reactions described in step a1) include conventional chemical reactions, classical organic name reactions, chemical reactions reported in academic journals and chemical reactions reported in patents.
Preferably, the reactive sites in step a2) comprise altered chemical bonds and atoms to which these bonds are directly attached;
the identification process specifically comprises the following steps:
by comparing the chemical structures of the starting materials and the products in the reaction, the altered chemical bonds and the atoms to which these bonds are directly attached are found.
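The comparison described above can be sketched as a diff of two atom-mapped bond tables. This is a minimal illustration, not the patented implementation: the bond-table representation, the function name `find_reaction_site` and the helper `pair` are assumptions introduced here, and the atom numbers are a made-up example.

```python
def find_reaction_site(reactant_bonds, product_bonds):
    """Return (changed_bonds, site_atoms) for two atom-mapped bond tables.

    Each table maps frozenset({atom_i, atom_j}) -> bond order, where the atom
    numbers are the one-to-one atom correspondence between reactants and products.
    """
    changed = set()
    for key in set(reactant_bonds) | set(product_bonds):
        # A bond is "altered" if it was formed, broken, or changed order.
        if reactant_bonds.get(key) != product_bonds.get(key):
            changed.add(key)
    site_atoms = set()
    for key in changed:
        site_atoms |= key  # atoms directly attached to an altered bond
    return changed, site_atoms


def pair(i, j):
    """Hypothetical helper: an unordered atom pair used as a bond key."""
    return frozenset((i, j))


# Made-up example: bond 1-2 is broken and bond 1-3 is formed.
reactants = {pair(1, 2): 1, pair(3, 4): 1, pair(1, 5): 2}
products = {pair(1, 3): 1, pair(3, 4): 1, pair(1, 5): 2}
changed, site = find_reaction_site(reactants, products)
print(sorted(site))
```

The unchanged bonds 3-4 and 1-5 drop out of the result, so only the altered bonds and their atoms remain as the reaction site.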
Preferably, the criteria for the judgment of the rationality of the chemical structure in step a) include:
removing compounds that contain unreasonable chemical structures; removing product structures in which the number of chemical element types has increased after the reaction; removing structures that have already appeared; and removing products that do not match at all the reaction conditions of the literature reactions from which the reaction fingerprint was derived.
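A minimal sketch of the first three criteria is given below. It is an assumption-laden illustration: the product representation (a dictionary with a canonical `key` and a list of `(element, bond_order_sum)` pairs), the simplified `MAX_VALENCE` table for neutral atoms, and the function names are all hypothetical, and the fourth criterion (matching reaction conditions) is not modelled.

```python
# Simplified maximum valences for neutral atoms (illustrative assumption).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "F": 1, "Cl": 1, "Br": 1}


def is_reasonable(product, reactant_elements, seen):
    """Apply three of the criteria: valence, no new element types, no repeats."""
    for element, bond_order_sum in product["atoms"]:
        # Unreasonable structure, e.g. a carbon with five bonds.
        if bond_order_sum > MAX_VALENCE.get(element, 8):
            return False
    # Element types that were absent from the starting materials.
    if {element for element, _ in product["atoms"]} - reactant_elements:
        return False
    # Structure that has already appeared.
    return product["key"] not in seen


def filter_products(candidates, reactant_elements):
    seen, kept = set(), []
    for product in candidates:
        if is_reasonable(product, reactant_elements, seen):
            seen.add(product["key"])
            kept.append(product)
    return kept


candidates = [
    {"key": "CCO", "atoms": [("C", 4), ("C", 4), ("O", 2)]},
    {"key": "bad-carbon", "atoms": [("C", 5), ("O", 2)]},    # over-valent carbon
    {"key": "CCS", "atoms": [("C", 4), ("C", 4), ("S", 2)]},  # sulfur is new
    {"key": "CCO", "atoms": [("C", 4), ("C", 4), ("O", 2)]},  # duplicate
]
kept = filter_products(candidates, reactant_elements={"C", "H", "O"})
print([p["key"] for p in kept])
```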
Preferably, the process of judging the rationality of the chemical structure further comprises:
ranking the products that meet the criteria by how easily the corresponding reaction rule proceeds; the ease of a reaction is determined by how frequently literature corresponding to the chemical reaction rule appears in the database.
Preferably, the process of determining the accuracy of the chemical structure in the step a) specifically comprises:
establishing, from chemical reaction literature information, an ionized group database that records the easily ionized group structures that can affect the LCMS peaks of a compound and the influence of adduct ion peaks on the measured molecular weight of the compound;
then inputting the information of all compounds to be reacted, together with the molecular weight data obtained by analyzing the reaction liquid or crude product in a liquid chromatography-mass spectrometer after the reaction has been run; calculating the molecular weight of each predicted product while matching its structure against the group structures in the ionized group database; and comparing the calculated product data with the molecular weights detected in the actual reaction liquid to find the compounds whose molecular weights agree, which are the screened products.
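The molecular weight comparison can be sketched as matching a predicted neutral mass, shifted by common adduct ions, against the observed peaks. This is only a sketch under stated assumptions: the adduct table below lists standard mass shifts but is not the patent's ionized group database (its group-dependent part is not modelled), and the function name, tolerance and example masses are hypothetical.

```python
# Mass shifts (Da) of a few common singly charged adducts relative to the
# neutral monoisotopic mass M. A stand-in for the ionized group database.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.00728,
    "[M+Na]+": 22.98922,
    "[M+NH4]+": 18.03383,
    "[M-H]-": -1.00728,
}


def match_adducts(neutral_mass, observed_mz, tol=0.01):
    """Return (adduct, observed m/z) pairs that agree within tol."""
    hits = []
    for adduct, shift in ADDUCT_SHIFTS.items():
        expected = neutral_mass + shift
        for mz in observed_mz:
            if abs(expected - mz) <= tol:
                hits.append((adduct, mz))
    return hits


# Hypothetical predicted product of neutral mass 250.0000 and three LCMS peaks.
hits = match_adducts(250.0, [251.0073, 272.9892, 310.5])
print(hits)
```

A predicted product with at least one hit would count as consistent with the measured data; a product with none would be set aside at this stage.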
Preferably, the multi-round operation process in step b) specifically includes:
b1) carrying out one operation on each screened product and all initial compounds to obtain a new reaction product;
b2) repeating the operation process until all the possible reaction products are obtained in the system during each single-step chemical reaction.
Preferably, the operation process in step b1) is specifically:
according to the chemical reaction rules, calculating the products from the raw materials and then screening and ranking the product results.
Preferably, the screening process in step b) is specifically:
ranking the product structures at each molecular weight by how easily the corresponding reaction rule proceeds; the ease of a reaction is determined by how frequently literature corresponding to the chemical reaction rule appears in the database.
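The final screening step can be sketched as grouping candidate structures by molecular weight and keeping, in each group, the one derived from the most frequently documented rule. The tuple layout, the rounding to two decimals and the function name are assumptions made for illustration; ties are broken here by structure name, which the source does not specify.

```python
from collections import defaultdict


def best_structure_per_mass(candidates, decimals=2):
    """candidates: iterable of (structure, mass, rule_frequency) tuples.

    Returns {rounded mass: structure whose rule has the highest frequency}.
    """
    groups = defaultdict(list)
    for structure, mass, rule_frequency in candidates:
        groups[round(mass, decimals)].append((rule_frequency, structure))
    # max() compares frequency first, then the structure name on ties.
    return {mass: max(items)[1] for mass, items in groups.items()}


# Hypothetical candidates: A and B share a molecular weight.
candidates = [("A", 180.06, 120), ("B", 180.06, 15), ("C", 222.07, 40)]
best = best_structure_per_mass(candidates)
print(best)
```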
The invention provides a method for predicting chemical reaction products, comprising the following steps: a) establishing a database of chemical reaction literature; calculating, from the reactions recorded in the literature, all reactions that may occur in a single step between the initial molecules; and judging the rationality and the accuracy of the chemical structures of the reaction products to obtain screened products; b) performing multiple rounds of calculation on the screened products obtained in step a) together with all initial compounds to obtain every reaction product that may form in the system at each single-step reaction, and finally screening the product structures at each molecular weight to find the most probable structure. Compared with the prior art, the method achieves forward prediction of reaction products from the reaction raw materials; it calculates all products of a single chemical reaction quickly and fully automatically, judges product structures accurately by molecular weight, and gives the synthesis paths of all products while predicting them, which is convenient for researchers to consult and guides reaction practice.
In addition, the prediction method can be combined with various detection means, such as mass spectrometry and nuclear magnetic hydrogen and carbon spectra, to make the data more complete, which gives it broad application prospects.
Drawings
FIG. 1 is a flow chart of the determination of the accuracy of chemical structures in the prediction method provided by the present invention;
FIG. 2 is a flow chart of a method for predicting chemical reaction products provided by the present invention;
FIG. 3 is a flow chart of an embodiment of the present invention for calculating all products in a chemical reaction;
FIG. 4 shows the predicted results of example 1 of the present invention;
FIG. 5 shows the predicted results of example 2 of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments of the present invention, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a prediction method of a chemical reaction product, which comprises the following steps:
a) establishing a database of chemical reaction literature; calculating, from the reactions recorded in the literature, all reactions that may occur in a single step between the initial molecules; and judging the rationality and the accuracy of the chemical structures of the reaction products to obtain screened products;
b) performing multiple rounds of calculation on the screened products obtained in step a) together with all initial compounds to obtain every reaction product that may form in the system at each single-step reaction, and finally screening the product structures at each molecular weight to find the most probable structure.
The invention first establishes a database containing chemical reaction literature, namely a chemical reaction database. In the present invention, the process of establishing the database containing the chemical reaction literature is preferably embodied as follows:
a1) storing the collected various chemical reactions in advance through a computer, converting the chemical reactions into a computer storage format, and then performing data processing to obtain one-to-one correspondence information of atoms in reactants and products;
a2) according to the corresponding information, identifying the reaction sites, and extracting the atoms, chemical bonds and groups which are directly connected with or conjugated with the reaction sites and the groups which are indirectly connected with the reaction sites and influence the reaction as the information for identifying the chemical reaction, and further storing the information in a database as the reaction rule.
In the present invention, the various chemical reactions preferably include conventional chemical reactions, classical organic name reactions, chemical reactions reported in academic journals, and chemical reactions reported in patents. On this basis, the set of reactions can also be expanded and updated in real time as new research results appear; the data in the established database are likewise data obtained after chemists have carried out individual chemical reactions.
In the present invention, the reaction site preferably comprises altered chemical bonds and atoms to which these bonds are directly attached; on this basis, the process of identification is preferably specifically:
by comparing the chemical structures of the starting materials and the products in the reaction, the altered chemical bonds and the atoms to which these bonds are directly attached are found.
After the database is established, the method calculates, from the reactions recorded in the literature, all reactions that may occur in a single step between the initial molecules, and judges the rationality and the accuracy of the chemical structures of the reaction products to obtain the screened products. In the present invention, the initial molecules include all initial compounds, such as reagents, solvents and reaction raw materials. Using the information of all initial compounds, all reactions that may occur between them in a single step are calculated on the basis of the established database, giving all possible single-step products. The calculation is constrained: the input molecules are not allowed to react arbitrarily; a reaction is considered only when the set of initial molecules contains at least the raw materials that the reaction requires.
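The constraint on which reactions are considered can be sketched as a subset test: a rule applies only if everything it requires is present among the initial molecules. This is a deliberately simplified assumption, since a real system would match substructures rather than text labels; the rule names, labels and function name are hypothetical.

```python
def applicable_rules(rules, initial_molecules):
    """Keep only rules whose required raw materials are all in the input set."""
    pool = set(initial_molecules)
    return [rule["name"] for rule in rules if set(rule["requires"]) <= pool]


# Hypothetical rule table and reaction mixture, described by class labels.
rules = [
    {"name": "rule-1", "requires": ["amine", "acyl chloride"]},
    {"name": "rule-2", "requires": ["aryl halide", "boronic acid"]},
]
mixture = ["amine", "acyl chloride", "solvent", "base"]
usable = applicable_rules(rules, mixture)
print(usable)
```

The extra entries in the mixture (solvent, base) do not block a rule; only missing requirements do.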
In the present invention, the initial molecules can usually undergo more than one reaction in a single step, and the rationality and accuracy judgments of the chemical structure described above begin as soon as each calculation result is obtained. For any single chemical reaction, therefore, the calculation, the rationality judgment and the accuracy judgment follow one another in sequence, but across all possible reactions there is no strict ordering among them.
In the present invention, the criterion for the judgment of the rationality of the chemical structure preferably includes:
removing compounds that contain unreasonable chemical structures; removing product structures in which the number of chemical element types has increased after the reaction; removing structures that have already appeared; and removing products that do not match at all the reaction conditions of the literature reactions from which the reaction fingerprint was derived. On this basis, products that fail these criteria are excluded directly.
In the present invention, the process of judging the rationality of the chemical structure preferably further includes:
ranking the products that meet the criteria by how easily the corresponding reaction rule proceeds. In the present invention, the ease of a reaction is preferably determined by how frequently literature corresponding to its chemical reaction rule appears in the database. A reaction rule extracted from a chemical reaction captures only the part that changes at the reaction center, so different chemical reactions can correspond to the same rule, and the number of chemical reactions for a rule means the number of documents. Frequency is used to judge the ease of a rule because analysis of literature data shows that reactions that occur more readily also appear more often in the literature. On this basis, the results are ranked by likelihood, with products derived from readily occurring reactions preferred.
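The frequency-based ranking can be sketched by counting the distinct documents behind each rule and sorting products by that count. The record layout, identifiers and function names below are hypothetical, and counting distinct documents is one reading of "the number of documents".

```python
def rule_frequencies(records):
    """records: iterable of (document_id, rule_id) pairs from the database.

    Several literature reactions can map to the same rule; the frequency of
    a rule is taken here as the number of distinct documents behind it.
    """
    documents = {}
    for document_id, rule_id in records:
        documents.setdefault(rule_id, set()).add(document_id)
    return {rule_id: len(ids) for rule_id, ids in documents.items()}


def rank_products(products, frequencies):
    """Sort products so that those from frequently documented rules come first."""
    return sorted(products, key=lambda p: frequencies.get(p["rule"], 0), reverse=True)


records = [("doc1", "R1"), ("doc2", "R1"), ("doc3", "R2"), ("doc2", "R1")]
frequencies = rule_frequencies(records)
ranked = rank_products(
    [{"name": "P_b", "rule": "R2"}, {"name": "P_a", "rule": "R1"}], frequencies
)
print([p["name"] for p in ranked])
```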
In the present invention, the process of accurately judging the chemical structure preferably includes:
establishing, from chemical reaction literature information, an ionized group database that records the easily ionized group structures that can affect the LCMS peaks of a compound and the influence of adduct ion peaks on the measured molecular weight of the compound;
then inputting the information of all compounds to be reacted, together with the molecular weight data obtained by analyzing the reaction liquid or crude product in a liquid chromatography-mass spectrometer after the reaction has been run; calculating the molecular weight of each predicted product while matching its structure against the group structures in the ionized group database; and comparing the calculated product data with the molecular weights detected in the actual reaction liquid to find the compounds whose molecular weights agree, which are the screened products.
In the accuracy judgment, therefore, the ionized group database is established first and the judgment is then made against it, so that products found reasonable for a single-step reaction are further checked by molecular weight to give the accurate products, that is, the screened products.
In the present invention, the molecular weight data are obtained by LCMS (liquid chromatography-mass spectrometry); other detection means, such as nuclear magnetic resonance (NMR) hydrogen and carbon spectra, may also be used for reference.
In the present invention, the compounds entered as input include impurities carried in by the raw materials. A compound can be supplied in MOL format, in SMILES format, or as other information that can be converted into structural data, such as a CAS number or a name; common reagents can be entered by their English names.
In the invention, the process of accurately judging the chemical structure is a process of inputting the information of all the compounds to be reacted on the basis of the ionized group database to obtain a screened product; the detailed flow chart is shown in fig. 1, wherein MS is mass spectrum.
In the present invention, some chemical reaction intermediates give no peak, and hence no molecular weight, in liquid chromatography-mass spectrometry because of the influence of certain groups. Screening by molecular weight alone would therefore discard products with such structures even though they may be real. Specifically, the stage of judging rationality and accuracy first checks whether the chemical structure of a product is reasonable (a product in which a carbon is bonded to 5 atoms, for example, is an unreasonable structure) and then checks whether the molecular weight of the generated product matches one of the LCMS molecular weights entered in advance; a product whose molecular weight matches none of them is treated as an inaccurate result. However, under certain conditions some compounds are not detected by the LCMS instrument, so their molecular weights are absent from the LCMS input even though the compounds are actually present. These molecules need to be carried into the next round of calculation. Molecules without an LCMS molecular weight usually share specific characteristics, so they can be retrieved at the accuracy-judgment stage and substituted into the next calculation.
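One way to picture this retrieval step is a selection that keeps mass-matched products and additionally carries forward unmatched candidates bearing a group known to suppress LCMS detection. The source does not say which characteristics identify such molecules, so the `silent_groups` labels, the data layout, the tolerance and the function name are purely hypothetical.

```python
def select_for_next_round(candidates, observed_masses, silent_groups, tol=0.5):
    """Split candidates into mass-matched products and carried-over molecules.

    silent_groups is a hypothetical set of group labels assumed to make a
    compound invisible to LCMS; unmatched candidates without one are dropped.
    """
    matched, carried = [], []
    for candidate in candidates:
        if any(abs(candidate["mass"] - m) <= tol for m in observed_masses):
            matched.append(candidate["name"])
        elif silent_groups & set(candidate["groups"]):
            carried.append(candidate["name"])
    return matched, carried


candidates = [
    {"name": "P1", "mass": 300.1, "groups": ["ester"]},
    {"name": "I1", "mass": 150.0, "groups": ["labile_group"]},
    {"name": "X", "mass": 410.2, "groups": ["ether"]},
]
matched, carried = select_for_next_round(candidates, [300.2], {"labile_group"})
print(matched, carried)
```

Both lists would feed the next round of calculation; only the matched list counts as confirmed by the measured data.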
After the screened product is obtained, the invention carries out multiple rounds of operation on the screened product and all initial compounds to obtain all possible reaction products in the system in each single-step chemical reaction generation process, and finally screens the product structure under each molecular weight to find out the most possible structure. In the present invention, all the starting compounds are the same as those described in the above technical schemes and are not described herein again.
In the present invention, the process of the multi-round operation is preferably as follows:
b1) carrying out one operation on each screened product and all initial compounds to obtain a new reaction product;
b2) repeating the operation process until all the possible reaction products are obtained in the system during each single-step chemical reaction.
In the process of the above-mentioned multiple rounds of calculation, firstly, each screened product and all the initial compounds are subjected to one calculation to obtain a new reaction product. In the present invention, the operation process is preferably as follows:
according to the chemical reaction rules, calculating the products from the raw materials and then screening and ranking the product results. In the present invention, the chemical reaction rules come from the database containing chemical reaction literature and the ionized group database established as described above.
After the new reaction products are obtained, the calculation is repeated on them until all reaction products that may form in the system at each single-step reaction have been derived. In the present invention, this derivation is generally complete after two to three rounds of calculation.
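The multi-round procedure can be sketched as a loop that reacts each newly screened product with all initial compounds until no new product appears or a round limit is reached. The `react` and `keep` callables, the lookup table and the compound names are hypothetical stand-ins for the rule database and the rationality and accuracy judgments; the recorded parents illustrate how a synthesis path could be reported alongside each product.

```python
def predict_all_products(initial, react, keep, max_rounds=3):
    """Iterate single-step predictions; return {product: (parent_a, parent_b)}.

    react(a, b) yields candidate products of one single-step reaction;
    keep(product) stands in for the rationality and accuracy judgments.
    """
    initial = list(initial)
    found = {}
    frontier = initial  # round 1 reacts the initial compounds with each other
    for _ in range(max_rounds):
        new = []
        for a in frontier:
            for b in initial:
                for product in react(a, b):
                    if product in found or product in initial or not keep(product):
                        continue
                    found[product] = (a, b)  # remember how it was made
                    new.append(product)
        if not new:
            break  # nothing new: every reachable product has been derived
        frontier = new  # next round: new products against all initial compounds
    return found


# Hypothetical single-step rules as a lookup table.
TABLE = {("A", "B"): ["P1"], ("P1", "B"): ["P2"]}
found = predict_all_products(
    ["A", "B", "solvent"],
    react=lambda a, b: TABLE.get((a, b), []),
    keep=lambda product: True,
)
print(found)
```

In this toy case P2 is reached in two steps through P1, and the loop stops in the third round because no further product is generated.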
Finally, the invention screens the product structure under each molecular weight to find out the most possible structure. In the present invention, the screening process preferably includes:
ranking the product structures at each molecular weight by how easily the corresponding reaction rule proceeds. In the present invention, the ease of a reaction is determined by how frequently literature corresponding to its chemical reaction rule appears in the database. A reaction rule extracted from a chemical reaction captures only the part that changes at the reaction center, so different chemical reactions can correspond to the same rule, and the number of chemical reactions for a rule means the number of documents. Frequency is used to judge the ease of a rule because analysis of literature data shows that reactions that occur more readily also appear more often in the literature. On this basis, the results are ranked by likelihood, with products derived from readily occurring reactions preferred, so that the most probable product at each molecular weight is found.
In the present invention, the flow chart of the overall technical solution is shown in fig. 2. The technical solution provided by the invention thus achieves, for the first time, forward prediction of reaction products from the reaction raw materials, and ranks the results by combining MS data with likelihood-based screening, so that the prediction is accurate and reliable. It helps chemists predict all products generated in a chemical reaction more quickly and comprehensively; the whole process is performed by a program without manual operation, which makes the results convenient for researchers to consult and guides reaction practice.
To further illustrate the present invention, the following examples are provided for illustration.
An example of calculating all products in a chemical reaction is as follows. Three chemical reactions occur in total, and the longest synthetic route, that of by-product 3, has two steps. During the calculation, all possible reactions among the four initial raw materials (including reagents, solvents, and compounds 1 and 2) are computed first to give candidate products, from which products 1 and 2 are selected by the rationality judgment and the molecular weight (accuracy) judgment. Products 1 and 2 are then calculated again with the four initial raw materials to give further products, which undergo the same rationality and molecular weight judgments. Finally, the results at each molecular weight are ranked to give products 1, 2 and 3, as shown in fig. 3.
Example 1
In the synthesis of intermediate 1 of olmesartan:
the products automatically predicted by molecular weight are shown in FIG. 4.
Example 2
In the synthesis of the apixaban drug substance:
the predicted reaction results based on molecular weight are shown in FIG. 5.
In conclusion, the method provided by the invention for predicting chemical reaction products can calculate all products of a chemical reaction quickly and fully automatically; it judges product structures accurately according to molecular weight; and it gives the synthetic route to every product alongside the prediction, which is convenient for researchers to consult.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for predicting a chemical reaction product, comprising the steps of:
a) establishing a database of chemical reaction literature, calculating every reaction that could occur in a single step between the initial molecules according to the chemical reactions recorded in the literature, and performing a chemical structure rationality judgment and a chemical structure accuracy judgment on the reaction products to obtain screened products;
b) performing multiple rounds of calculation on the screened products obtained in step a) and all initial compounds to obtain all reaction products that could form in the system through each single-step chemical reaction, and finally screening the product structures at each molecular weight to find the most probable structure.
2. The prediction method according to claim 1, wherein the process of creating the database containing the chemical reaction literature in step a) is specifically:
a1) storing the collected chemical reactions in a computer in advance, converting them into a computer storage format, and then processing the data to obtain the one-to-one correspondence between atoms in the reactants and atoms in the products;
a2) identifying the reaction sites according to this correspondence, extracting, as the information that identifies the chemical reaction, the atoms, chemical bonds, and groups directly connected to or conjugated with the reaction sites together with the groups that are indirectly connected to the reaction sites and influence the reaction, and storing this information in the database as a reaction rule.
3. The prediction method according to claim 2, wherein the chemical reactions in step a1) include conventional chemical reactions, classical organic name reactions, chemical reactions reported in academic journals, and chemical reactions reported in patents.
4. The prediction method according to claim 2, wherein the reaction sites in step a2) comprise the changed chemical bonds and the atoms directly connected to those bonds;
the identification process is specifically:
comparing the chemical structures of the starting materials and the products of the reaction to find the changed chemical bonds and the atoms directly attached to them.
5. The prediction method according to claim 1, wherein the criteria for the chemical structure rationality judgment in step a) comprise:
removing compounds that contain unreasonable chemical structures; removing product structures in which the number of element types has increased after the reaction; removing structures that have already appeared; and removing products that do not match at all the reaction conditions of the literature reactions from which the reaction fingerprint was derived.
6. The prediction method according to claim 5, wherein the chemical structure rationality judgment further comprises:
ranking the products that meet the criteria according to the reaction rules and the difficulty of the reaction; the difficulty of the reaction is determined from the number of literature records in the database that correspond to the chemical reaction rule.
7. The prediction method according to claim 1, wherein the chemical structure accuracy judgment in step a) is specifically:
establishing, according to chemical reaction literature information, an ionized-group database that records the readily ionized group structures that may affect the LC-MS peaks of a compound and the influence of adduct ion peaks on the molecular weight of the compound;
then inputting the information on all compounds to be reacted together with the molecular weight data obtained by analyzing the reaction liquid or crude product in a liquid chromatography-mass spectrometer after the chemical reaction has been carried out, calculating the molecular weights of the calculated products, matching the product structures against the group structures in the ionized-group database, comparing the calculated chemical reaction product data with the molecular weight information measured from the reaction liquid of the actual chemical reaction, and finding the compounds whose molecular weights agree, which are the screened products.
8. The prediction method according to claim 1, wherein the multiple rounds of operations in step b) are specifically performed by:
b1) performing one round of calculation on each screened product together with all initial compounds to obtain new reaction products;
b2) repeating this process until all reaction products that could form in the system through each single-step chemical reaction have been obtained.
9. The prediction method according to claim 8, wherein the operation in step b1) is specifically:
calculating products from the raw materials according to the chemical reaction rules, and then screening and ranking the resulting products.
10. The prediction method according to claim 1, wherein the screening in step b) is specifically performed by:
ranking the product structures at each molecular weight according to the reaction rules and the difficulty of the reaction; the difficulty of the reaction is determined from the number of literature records in the database that correspond to the chemical reaction rule.
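The ranking described in claims 6 and 10 might be sketched in Python as below. The sketch assumes that a rule backed by more literature records corresponds to an easier, and therefore more probable, reaction; the claims state only that the difficulty is determined from the literature frequency, so the direction of the ordering is an assumption. The tuple layout and names are likewise hypothetical, and molecular weights are assumed to be rounded beforehand so that equal values group together.

```python
from collections import defaultdict


def rank_by_molecular_weight(candidates, rule_frequency):
    """Group candidate structures by molecular weight and rank each group.

    candidates: iterable of (structure, molecular_weight, rule_id) tuples
    rule_frequency: dict mapping rule_id to its number of literature records
    Returns a dict mapping molecular_weight to a list of structures,
    most frequently documented rule first.
    """
    groups = defaultdict(list)
    for structure, mw, rule_id in candidates:
        groups[mw].append((rule_frequency.get(rule_id, 0), structure))
    return {
        mw: [s for _, s in sorted(items, key=lambda t: -t[0])]
        for mw, items in groups.items()
    }
```

The first entry of each list would then be reported as the most probable structure for that molecular weight.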

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011302735.1A CN114550841A (en) 2020-11-19 2020-11-19 Prediction method of chemical reaction product

Publications (1)

Publication Number Publication Date
CN114550841A (en) 2022-05-27

Family

ID=81659239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011302735.1A Pending CN114550841A (en) 2020-11-19 2020-11-19 Prediction method of chemical reaction product

Country Status (1)

Country Link
CN (1) CN114550841A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination