CN114974450B - Method for generating operation steps based on machine learning and automatic test device - Google Patents

Method for generating operation steps based on machine learning and automatic test device

Info

Publication number
CN114974450B
Authority
CN
China
Prior art keywords
chemical reaction
substance
operation steps
molecular
reaction formula
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210741235.0A
Other languages
Chinese (zh)
Other versions
CN114974450A (en
Inventor
吴海超
曾琢
吴静巍
陆文洋
公维博
杨承颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Woshi Digital Technology Co ltd
Original Assignee
Suzhou Woshi Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Woshi Digital Technology Co ltd
Priority to CN202210741235.0A
Publication of CN114974450A
Application granted
Publication of CN114974450B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/10 Analysis or design of chemical reactions, syntheses or processes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Abstract

The present disclosure relates to a method for generating operation steps based on machine learning and an automated test device. The method comprises: acquiring a molecular diagram feature matrix, a molecular diagram adjacency matrix and molecular fingerprint features of each substance in a chemical reaction formula; and inputting the molecular diagram feature matrix and the molecular diagram adjacency matrix into an operation step prediction model to obtain the operation steps corresponding to the chemical reaction formula. According to embodiments of the disclosure, the operation steps corresponding to a chemical reaction can be predicted automatically. The disclosure further provides an optimization system and a laboratory automation platform: a machine learning algorithm makes the initial prediction, the laboratory automation platform verifies it, and the machine learning algorithm then optimizes further, forming a self-circulating prediction and optimization system. This reduces the randomness of manual testing and improves the accuracy of the operation steps.

Description

Method for generating operation steps based on machine learning and automatic test device
Technical Field
The disclosure relates to the technical field of artificial intelligence, and in particular to a method, an apparatus and a system for generating chemical reaction operation steps based on machine learning and an automated test device.
Background
To synthesize a target compound, it is necessary to specify not only the reactants, the reaction conditions (e.g., catalyst, reaction temperature, reaction time, etc.) and the reaction path (e.g., reactant A and reactant B react under reaction condition C to form intermediate D, and intermediate D and reactant E react under reaction condition F to form product G), but also the chemical reaction operation steps (e.g., feeding, reaction, stirring, filtration, crystallization, etc.). In the related art, the operation steps of a chemical reaction are mostly determined manually: a person combines the literature with personal experience and settles on the steps through repeated trial-and-error experiments. This is time-consuming and labor-intensive, and the outcome of the tests is highly subject to chance.
Disclosure of Invention
To overcome at least one of the problems associated with the related art, the present disclosure provides a method, apparatus, and system for generating chemical reaction operating steps based on machine learning and automated test equipment.
According to a first aspect of embodiments of the present disclosure, there is provided a method for generating a chemical reaction operation step, including:
acquiring a molecular diagram feature matrix, molecular diagram relation structure information and molecular fingerprint characteristics of each substance in a chemical reaction formula;
inputting the molecular diagram feature matrix and the molecular diagram relation structure information into an operation step prediction model, and outputting an initial operation step corresponding to the chemical reaction formula;
Inputting the molecular fingerprint characteristics into a feeding sequence prediction model, and outputting the feeding sequence of the substances in the chemical reaction formula;
according to the feeding sequence, matching each substance into the initial operation step to obtain an operation step corresponding to the chemical reaction formula;
and screening the multiple groups of operation steps by using an automatic test device and a preset optimization algorithm to determine the optimal operation steps.
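The matching step in this first aspect can be pictured with a small sketch. It assumes, purely for illustration, that the initial operation steps are a list of tokens in which each feeding step is a bare `ADD` placeholder, and that the feeding sequence is a list of substance names; the token vocabulary and the helper `fill_steps` are hypothetical and are not taken from the disclosure.

```python
from typing import List


def fill_steps(initial_steps: List[str], feeding_order: List[str]) -> List[str]:
    """Fill each ADD placeholder with the next substance in the feeding order."""
    remaining = iter(feeding_order)
    filled = []
    for step in initial_steps:
        if step == "ADD":
            # Each ADD slot consumes one substance, in the predicted order.
            filled.append(f"ADD {next(remaining)}")
        else:
            filled.append(step)
    return filled


# Hypothetical outputs of the two prediction models.
initial_steps = ["ADD", "ADD", "STIR", "ADD", "FILTER"]
feeding_order = ["solvent", "reactant A", "reactant B"]

steps = fill_steps(initial_steps, feeding_order)
print(steps)
# ['ADD solvent', 'ADD reactant A', 'STIR', 'ADD reactant B', 'FILTER']
```

Keeping the step skeleton and the feeding order as separate predictions means the skeleton itself does not have to encode which substance goes where.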
In one possible implementation, obtaining the molecular diagram feature matrix of each substance in the chemical reaction formula includes:
acquiring atomic characteristics and chemical bond characteristics of a molecular diagram of a substance in a chemical reaction formula;
and determining a molecular diagram feature matrix of each substance according to the atomic features and the chemical bond features.
In one possible implementation, obtaining the structural information of the molecular diagram relationship of each substance in the chemical reaction formula includes:
obtaining a connection relation of molecular diagram nodes of substances in a chemical reaction formula;
and determining molecular diagram relation structure information of each substance according to the connection relation.
In one possible implementation, the operation step prediction model includes an encoder and a decoder, and inputting the molecular diagram feature matrix and the molecular diagram relation structure information into the operation step prediction model and outputting the initial operation step corresponding to the chemical reaction formula includes:
Inputting the molecular diagram feature matrix and molecular diagram relation structure information of each substance to an encoder, and outputting context vectors of each substance through the encoder;
and inputting the context vector of each substance to the decoder, and outputting the initial operation steps corresponding to the chemical reaction formulas.
In one possible implementation, the substances include a reactant, a condition object, and a product; inputting the molecular diagram feature matrix and the molecular diagram relation structure information of each substance to the encoder, and outputting the context vector of each substance through the encoder, includes:
respectively acquiring a context vector of the reactant, a context vector of the condition object and a context vector of the product;
adding the context vector of the reactant and the context vector of the condition object to obtain an intermediate vector;
and combining the intermediate vector with the context vector of the product to obtain the context vector of the chemical reaction formula.
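The vector arithmetic in this implementation can be sketched as follows. The text says only that the intermediate vector is "combined" with the product vector; the sketch assumes concatenation, which is one plausible reading and not confirmed by the disclosure. The function name `reaction_context` and the toy values are illustrative.

```python
import numpy as np


def reaction_context(reactant_ctx, condition_ctx, product_ctx):
    """Add reactant and condition vectors, then join the result with the product vector."""
    intermediate = reactant_ctx + condition_ctx  # element-wise addition
    # Assumption: "combining" is modelled here as concatenation.
    return np.concatenate([intermediate, product_ctx])


reactant = np.array([1.0, 2.0])
condition = np.array([0.5, -1.0])
product = np.array([3.0, 4.0])

ctx = reaction_context(reactant, condition, product)
print(ctx)  # [1.5 1.  3.  4. ]
```

Because element-wise addition is commutative, swapping the two added vectors leaves the intermediate vector unchanged.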
In one possible implementation, the decoder includes a GRU network decoder, the inputting the context vector to the decoder, outputting the initial operation step corresponding to the chemical reaction formula includes:
Inputting a word vector of a preset start word symbol and the context vector into a first GRU unit of a GRU network decoder, and outputting a word vector of a first predicted word symbol and a first hidden state, wherein the GRU network decoder comprises a plurality of GRU units;
and inputting the word vector of the first predicted word symbol and the first hidden state into the next GRU unit, and so on, until a preset end word symbol is output or the length of the predicted word symbol sequence reaches a preset value, so as to obtain the initial operation step corresponding to the chemical reaction formula.
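The decoding loop described above can be sketched as a greedy decoder built on a hand-written GRU cell. Everything concrete here is an assumption made for illustration: the token vocabulary, the dimensions, the random (untrained) weights, the omission of bias terms, and the choice to use the context vector as the initial hidden state. The update rule is one common GRU convention.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["<start>", "<end>", "ADD", "STIR", "FILTER"]  # hypothetical tokens
EMB, HID, MAX_LEN = 4, 6, 8                            # hypothetical sizes

# Untrained random parameters; a real model would learn these.
embed = rng.normal(size=(len(VOCAB), EMB))
Wz, Wr, Wh = (rng.normal(size=(HID, EMB + HID)) * 0.5 for _ in range(3))
Wout = rng.normal(size=(len(VOCAB), HID))


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_cell(x, h):
    """One GRU step (biases omitted for brevity)."""
    xh = np.concatenate([x, h])
    z = sigmoid(Wz @ xh)                                 # update gate
    r = sigmoid(Wr @ xh)                                 # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([x, r * h]))   # candidate state
    return (1.0 - z) * h + z * h_tilde


def greedy_decode(context):
    h = context                        # assumption: context seeds the hidden state
    token = VOCAB.index("<start>")     # preset start word symbol
    out = []
    for _ in range(MAX_LEN):           # stop when the preset length is reached
        h = gru_cell(embed[token], h)
        token = int(np.argmax(Wout @ h))
        if VOCAB[token] == "<end>":    # stop when the end word symbol is predicted
            break
        out.append(VOCAB[token])
    return out


steps = greedy_decode(rng.normal(size=HID))
print(steps)
```

The two stopping conditions in the loop correspond to the end word symbol and the preset maximum length.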
In one possible implementation manner, the training manner of the operation step prediction model includes:
obtaining a first sample set, wherein the first sample set comprises first chemical reaction formula samples labeled with operation step categories;
obtaining a molecular diagram feature matrix sample and a molecular diagram relation structure information sample of the substances in the first chemical reaction formula sample;
inputting the molecular diagram feature matrix sample and the molecular diagram relation structure information sample into an initial operation step prediction model to generate a prediction result;
and iteratively adjusting training parameters in the initial operation step prediction model based on the difference between the prediction result and the marked operation step category until the preset requirement is met, so as to obtain the operation step prediction model.
In one possible implementation manner, the training manner of the feeding sequence prediction model includes:
acquiring a second sample set, wherein the second sample set comprises second chemical reaction formula samples, and the substances in each second chemical reaction formula sample are labeled with a feeding sequence;
obtaining a molecular fingerprint characteristic sample of a substance in the second chemical reaction formula sample;
inputting the molecular fingerprint characteristic sample into an initial feeding sequence prediction model to generate a prediction result;
and iteratively adjusting training parameters in the initial feeding sequence prediction model based on the difference between the prediction result and the marked feeding sequence until the preset requirement is met, so as to obtain the feeding sequence prediction model.
In one possible implementation, screening the multiple groups of operation steps by using an automatic test device and a preset optimization algorithm to determine the optimal operation steps includes:
for each group of operation steps in the multiple groups of operation steps, obtaining the reaction result produced when the automatic test device performs the chemical reaction operation on the substances in the chemical reaction formula according to that group of operation steps;
And determining the optimal operation steps from the multiple groups of operation steps according to the multiple reaction results and a preset optimization algorithm.
In one possible implementation, obtaining, for each group of operation steps in the multiple groups of operation steps, the reaction result produced when the automatic test device performs the chemical reaction operation on the substances in the chemical reaction formula according to that group of operation steps includes:
for each group of operation steps in the multiple groups of operation steps, converting the group of operation steps into instruction information that the automatic test device can recognize, wherein the instruction information instructs the automatic test device to perform the chemical reaction operation on the substances in the chemical reaction formula according to the group of operation steps to obtain the corresponding reaction result.
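As a rough illustration of converting operation steps into instruction information, the sketch below maps textual steps onto a JSON payload. The disclosure does not specify an instruction format; the opcode table, the field names `seq`, `op` and `arg`, and the helper `to_instructions` are all hypothetical.

```python
import json

# Hypothetical mapping from step keywords to device opcodes.
OPCODES = {"ADD": "dispense", "STIR": "stir", "FILTER": "filter"}


def to_instructions(steps):
    """Translate one group of operation steps into device-readable records."""
    instructions = []
    for index, step in enumerate(steps):
        action, _, argument = step.partition(" ")
        if action not in OPCODES:
            raise ValueError(f"unsupported step: {step!r}")
        instructions.append(
            {"seq": index, "op": OPCODES[action], "arg": argument or None}
        )
    return instructions


payload = json.dumps(to_instructions(["ADD solvent", "STIR", "FILTER"]))
print(payload)
```

Rejecting unknown steps up front, rather than skipping them, keeps the device from running a group of steps that differs from the one being evaluated.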
In one possible implementation, the preset optimization algorithm includes a Bayesian optimization model, the reaction result includes the yield or purity of the product, and determining the optimal operation steps from the multiple groups of operation steps according to the multiple reaction results and the preset optimization algorithm includes:
inputting the groups of operation steps into an initialized proxy (surrogate) model and fitting the initialized proxy model to obtain a predicted yield or purity of the product, wherein the proxy model represents the prior distribution;
selecting a candidate group of operation steps to test based on the value of an acquisition function over the prior distribution, wherein the acquisition function uses the mean and variance given by the initialized proxy model to decide whether to explore new combinations of operation steps or to exploit groups of operation steps whose experimental values are already known;
testing the candidate group of operation steps with the automatic test device to obtain the actual yield or purity of the product;
updating the prior distribution of the proxy model with the actual yield or purity of the product to obtain a posterior distribution;
and repeating the above process, with the posterior distribution serving as the new prior distribution, until the actual yield or purity meets the preset requirement.
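The loop above can be sketched with a small Gaussian-process surrogate and an upper-confidence-bound acquisition function. These are swapped-in, commonly used choices; the disclosure names neither. The sketch also assumes that each group of operation steps is encoded as a single number in [0, 1], and it replaces the automatic test device with a made-up function `run_experiment`.

```python
import numpy as np


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)


def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_new)
    mean = Ks.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mean, np.clip(var, 0.0, None)


def run_experiment(x):
    # Stand-in for the automatic test device: a made-up yield curve.
    return float(np.exp(-((x - 0.7) ** 2) / 0.02))


candidates = np.linspace(0.0, 1.0, 101)   # encoded groups of operation steps
x_obs = np.array([0.1, 0.9])              # initial experiments
y_obs = np.array([run_experiment(x) for x in x_obs])

for _ in range(15):
    mean, var = gp_posterior(x_obs, y_obs, candidates)
    ucb = mean + 2.0 * np.sqrt(var)       # mean exploits, variance explores
    x_next = candidates[int(np.argmax(ucb))]
    y_next = run_experiment(x_next)       # "real" yield from the device
    x_obs = np.append(x_obs, x_next)      # posterior becomes the next prior
    y_obs = np.append(y_obs, y_next)
    if y_next >= 0.95:                    # hypothetical preset requirement
        break

best_x = x_obs[int(np.argmax(y_obs))]
print(best_x, y_obs.max())
```

Each pass refits the surrogate on all results so far, which is how the posterior from one round serves as the prior for the next.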
According to a second aspect of the embodiments of the present disclosure, there is provided an apparatus for generating chemical reaction operation steps, including:
the first acquisition module is used for acquiring a molecular diagram feature matrix, molecular diagram relation structure information and molecular fingerprint characteristics of each substance in the chemical reaction formula;
the first prediction module is used for inputting the molecular diagram feature matrix and the molecular diagram relation structure information into an operation step prediction model and outputting an initial operation step corresponding to the chemical reaction formula;
The second prediction module is used for inputting the molecular fingerprint characteristics into a feeding sequence prediction model and outputting the feeding sequence of the substances in the chemical reaction formula;
and the generation module is used for matching each substance into the initial operation steps according to the feeding sequence to obtain the operation steps corresponding to the chemical reaction formulas.
In one possible implementation manner, the acquiring module includes:
the first acquisition submodule is used for acquiring the atomic characteristics and the chemical bond characteristics of the molecular diagram of the substance in the chemical reaction formula;
and the first determining submodule is used for determining a molecular diagram feature matrix of each substance according to the atomic feature and the chemical bond feature.
In one possible implementation manner, the acquiring module includes:
the second acquisition submodule is used for acquiring the connection relation of molecular diagram nodes of substances in the chemical reaction formula;
and the second determining submodule is used for determining molecular diagram relation structure information of each substance according to the connection relation.
In one possible implementation, the operation step prediction model includes an encoder and a decoder, and the first prediction module includes:
the first extraction submodule is used for inputting the molecular diagram feature matrix and the molecular diagram relation structure information of each substance to an encoder, and outputting the context vector of each substance through the encoder;
And the first prediction submodule is used for inputting the context vector of each substance to the decoder and outputting the initial operation step corresponding to the chemical reaction formula.
In one possible implementation, the substance includes a reactant, a condition, and a product; the first extraction submodule includes:
the acquisition unit is used for inputting the molecular diagram feature matrix and the molecular diagram relation structure information of each substance to the encoder, and outputting the context vector of each substance through the encoder;
the processing unit is used for adding the context vector of the reactant and the context vector of the condition object to obtain an intermediate vector;
and the generating unit is used for combining the intermediate vector and the context vector of the product to obtain the context vector of the chemical reaction formula.
In one possible implementation, the decoder includes a GRU network decoder, and the first prediction submodule includes:
the first prediction unit is used for inputting a word vector of a preset start word symbol and the context vector into a first GRU unit of the GRU network decoder, and outputting the word vector of the first prediction word symbol and a first hidden state, wherein the GRU network decoder comprises a plurality of GRU units;
and the second prediction unit is used for inputting the word vector of the first predicted word symbol and the first hidden state into the next GRU unit, and so on, until a preset end word symbol is output or the length of the predicted word symbol sequence reaches a preset value, so as to obtain the initial operation step corresponding to the chemical reaction formula.
In one possible implementation, the apparatus further comprises a first training module, and the first training module includes:
a third acquisition sub-module for acquiring a first sample set including first chemical reaction formula samples labeled with operation step categories;
a fourth obtaining submodule, configured to obtain a molecular diagram feature matrix sample and a molecular diagram relationship structure information sample of a substance in the first chemical reaction formula sample;
the second prediction submodule is used for inputting the molecular diagram feature matrix sample and the molecular diagram relation structure information sample into an initial operation step prediction model to generate a prediction result;
the first generation sub-module is used for iteratively adjusting training parameters in the initial operation step prediction model based on the difference between the prediction result and the marked operation step category until the preset requirement is met, so as to obtain the operation step prediction model.
In one possible implementation, the apparatus further comprises a second training module, and the second training module includes:
a fifth sub-module, configured to obtain a second sample set, where the second sample set includes second chemical reaction formula samples, and the substances in each second chemical reaction formula sample are labeled with a feeding sequence;
a sixth sub-module, configured to obtain a molecular fingerprint feature sample of the substances in the second chemical reaction formula sample;
the third prediction submodule is used for inputting the molecular fingerprint characteristic sample into an initial feeding sequence prediction model to generate a prediction result;
and the second generation sub-module is used for iteratively adjusting training parameters in the initial feeding sequence prediction model based on the difference between the prediction result and the marked feeding sequence until the preset requirement is met, so as to obtain the feeding sequence prediction model.
In one possible implementation, the apparatus further includes:
the second acquisition module is used for acquiring a plurality of groups of operation steps corresponding to the chemical reaction formula;
the third acquisition module is used for obtaining, for each group of operation steps in the plurality of groups of operation steps, the reaction result produced when the automatic test device performs the chemical reaction operation on the substances in the chemical reaction formula according to that group of operation steps;
And the determining module is used for determining the optimal operation steps from the plurality of groups of operation steps according to a plurality of reaction results and a preset optimization algorithm.
In one possible implementation manner, the third obtaining module includes:
and a fifth obtaining sub-module, configured to convert, for each set of operation steps in the multiple sets of operation steps, the each set of operation steps into instruction information identified by an automated test apparatus, where the instruction information is used to instruct the automated test apparatus to perform a chemical reaction operation on a substance in the chemical reaction formula according to the set of operation steps, so as to obtain a corresponding reaction result.
In one possible implementation, the preset optimization algorithm includes a Bayesian optimization model, the reaction result includes the yield or purity of the product, and the determining module includes:
a fitting sub-module, configured to input the groups of operation steps into an initialized proxy (surrogate) model and obtain a predicted yield or purity of the product by fitting the initialized proxy model, where the proxy model represents the prior distribution;
a selection sub-module for selecting a candidate group of operation steps to test based on the value of an acquisition function over the prior distribution, where the acquisition function uses the mean and variance given by the initialized proxy model to decide whether to explore new combinations of operation steps or to exploit groups of operation steps whose experimental values are already known;
the test sub-module is used for testing the candidate group of operation steps with the automatic test device to obtain the actual yield or purity of the product;
an updating sub-module for updating the prior distribution of the proxy model with the actual yield or purity of the product to obtain a posterior distribution;
and the iteration sub-module is used for repeating the above process, with the posterior distribution serving as the new prior distribution, until the actual yield or purity meets the preset requirement.
According to a third aspect of embodiments of the present disclosure, there is provided a generation system of chemical reaction operating steps, comprising:
the automatic test device comprises screening equipment, a reactor and detection equipment which are electrically connected;
an electronic device, comprising:
a processor, a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method of generating the chemical reaction operating step of any one of the embodiments of the present disclosure.
According to a fourth aspect of embodiments of the present disclosure, there is provided a generation apparatus of a chemical reaction operation step, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method for generating chemical reaction operation steps of any one of the embodiments of the present disclosure.
According to a fifth aspect of embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium, wherein instructions stored in the storage medium, when executed by a processor, cause the processor to perform the method for generating chemical reaction operation steps according to any one of the embodiments of the present disclosure.
According to a sixth aspect of embodiments of the present disclosure, there is provided a computer program product comprising instructions which, when executed by a processor of an electronic device, enable the electronic device to perform the method for generating chemical reaction operation steps according to any one of the embodiments of the present disclosure.
The technical solution provided by the embodiments of the present disclosure may have the following beneficial effects. By constructing the molecular diagram feature matrix, the molecular diagram relation structure information and the molecular fingerprint features of each substance in the chemical reaction formula, the structural characteristics of each substance can be reflected accurately. By combining the operation step prediction model with the feeding sequence prediction model, each substance is matched, according to the feeding sequence output by the feeding sequence prediction model, into the initial operation steps output by the operation step prediction model, yielding the operation steps corresponding to the chemical reaction formula. The embodiments of the disclosure can thus predict the operation steps corresponding to a chemical reaction automatically, reduce the randomness of manual testing, and improve the accuracy of the operation steps.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flow chart illustrating a method for generating chemical reaction operation steps according to an exemplary embodiment.
FIG. 2 is a flow chart illustrating a method for generating chemical reaction operation steps according to another exemplary embodiment.
FIG. 3 is a flow chart illustrating a method for generating chemical reaction operation steps according to another exemplary embodiment.
FIG. 4 is a schematic diagram illustrating a system for generating chemical reaction operation steps according to an exemplary embodiment.
FIG. 5 is a flow chart illustrating a method for generating chemical reaction operation steps according to another exemplary embodiment.
FIG. 6 is a schematic block diagram illustrating an apparatus for generating chemical reaction operation steps according to an exemplary embodiment.
FIG. 7 is a schematic block diagram illustrating an apparatus for generating chemical reaction operation steps according to another exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
In order to facilitate understanding of the technical solutions provided by the embodiments of the present disclosure by those skilled in the art, a technical environment in which the technical solutions are implemented is described below.
With the development of artificial intelligence technology, techniques that apply machine learning to chemical reactions have emerged. The related art focuses mainly on generating reaction paths and reaction conditions. Where chemical reaction operation steps are generated by a machine learning method, the recommended operation steps are not very accurate, and changing the order in which the reactants are input to the model produces different operation steps, which clearly violates basic chemical principles.
Based on actual technical needs similar to those described above, the present disclosure provides a method, apparatus and system for generating a chemical reaction operation step.
The method for generating chemical reaction operation steps described in the present disclosure is explained in detail below with reference to FIG. 1. FIG. 1 is a flow chart of one embodiment of the method for generating chemical reaction operation steps provided by the present disclosure. Although the present disclosure provides method steps as illustrated in the following examples or figures, the method may include more or fewer steps based on routine or non-inventive effort. For steps that have no logically necessary causal relationship, the order of execution is not limited to the order provided by the embodiments of the present disclosure.
Specifically, as shown in fig. 1, an embodiment of a method for generating a chemical reaction operation step provided in the present disclosure may be applied to a terminal or a server, and includes:
Step S101, a molecular diagram feature matrix, molecular diagram relation structure information and molecular fingerprint characteristics of each substance in the chemical reaction formula are obtained.
In embodiments of the present disclosure, the substances in the chemical reaction formula may include reactants, condition objects, products, and the like. The molecular diagram (molecular graph) is a graph that represents the structural properties of a molecule: a node in the molecular diagram represents an atom of the molecule, and an edge represents a chemical bond. In one example, the molecular diagram feature matrix may be constructed from the atomic features of the molecular diagram. The atomic features include, but are not limited to, at least one of the following: atomic symbol, formal charge, number of bonds, hybridization mode, number of hydrogens, total valence, chirality, whether a hydrogen bond donor, whether a hydrogen bond acceptor, whether within an aromatic hydrocarbon, whether in a ring, and whether in an n-membered ring. In another example, the atomic features may be combined with the chemical bond features of the molecular diagram to construct the molecular diagram feature matrix. The chemical bond features include, but are not limited to, at least one of the following: bond type, whether in a ring, whether conjugated, cis-isomerism, and trans-isomerism. One specific construction is as follows: a substance includes N nodes (atoms), each with its own atomic features and chemical bond features, and the features of the N nodes form an N×D feature matrix X, where the D dimensions cover the atomic features and the chemical bond features.
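A minimal sketch of the N×D construction follows, using a hand-coded ethanol molecule so that no cheminformatics library is needed. The chosen feature subset, the way bond features are summarized per atom (counts of single and double bonds), and the helper `feature_matrix` are assumptions for illustration; the disclosure lists candidate features but does not fix an encoding.

```python
import numpy as np

SYMBOLS = ["C", "N", "O"]  # hypothetical one-hot vocabulary for atomic symbols

# Ethanol, heavy atoms only: (symbol, formal_charge, num_hydrogens, in_ring)
atoms = [("C", 0, 3, 0), ("C", 0, 2, 0), ("O", 0, 1, 0)]
# Bonds as (begin_atom, end_atom, bond_order)
bonds = [(0, 1, 1), (1, 2, 1)]


def feature_matrix(atoms, bonds):
    """Build an N x D matrix: one row of atomic + bond-derived features per atom."""
    rows = []
    for idx, (symbol, charge, n_h, in_ring) in enumerate(atoms):
        one_hot = [1.0 if symbol == s else 0.0 for s in SYMBOLS]
        # Assumption: bond features are folded into each node as bond-order counts.
        orders = [order for a, b, order in bonds if idx in (a, b)]
        bond_feats = [float(orders.count(1)), float(orders.count(2))]
        rows.append(one_hot + [float(charge), float(n_h), float(in_ring)] + bond_feats)
    return np.array(rows)


X = feature_matrix(atoms, bonds)
print(X.shape)  # (3, 8): N = 3 atoms, D = 8 features
```

Here D = 8 comes from three one-hot symbol entries, three scalar atomic features and two bond-count features.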
In embodiments of the disclosure, the molecular diagram relation structure information represents the connection relationships of the nodes in the molecule, that is, the position of each node in the whole molecular diagram. It describes the structure of the molecular diagram and may take the form of a molecular diagram adjacency matrix or another expression of the diagram structure. In one example, an identifier is set for each atom in a molecule of a substance, and the relation structure information is constructed from the connection relationship between each start atom identifier and each end atom identifier, with a connection represented as 1 and no connection represented as 0. In embodiments of the disclosure, the molecular fingerprint feature represents an abstract feature of the molecule: the structural features of the molecule are first extracted and then encoded into a bit vector. The molecular fingerprint features may include Morgan fingerprints, extended-connectivity fingerprints (ECFPs), and the like. It should be noted that the manner of setting the fingerprint feature is not limited to the above examples; for example, the structural features of the molecule may be extracted by a graph convolutional neural network and the feature vector obtained by a hash algorithm. Those skilled in the art may make other changes in light of the technical spirit of the present application, and all such changes fall within the protection scope of the present application as long as the functions and effects achieved are the same as or similar to those of the present application.
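The idea of extracting structural features and hashing them into a bit vector can be illustrated with the following simplified sketch. It is loosely modeled on the Morgan-style scheme of iteratively combining each atom's identifier with those of its neighbours, but it is not the exact ECFP algorithm; the function names, the 64-bit length and the use of SHA-256 are assumptions made for this example.

```python
import hashlib


def _stable_hash(text):
    """Deterministic integer hash (Python's built-in hash() is salted per process)."""
    return int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)


def toy_fingerprint(symbols, adjacency, radius=2, n_bits=64):
    """Simplified Morgan-style bit vector; NOT the exact ECFP algorithm."""
    n = len(symbols)
    ids = [_stable_hash(s) for s in symbols]  # radius-0 identifiers from atom symbols
    bits = [0] * n_bits
    for identifier in ids:
        bits[identifier % n_bits] = 1
    for _ in range(radius):
        new_ids = []
        for i in range(n):
            # Combine the atom's identifier with the sorted identifiers of its neighbours.
            neighbours = sorted(ids[j] for j in range(n) if adjacency[i][j] == 1)
            new_ids.append(_stable_hash(f"{ids[i]}|{neighbours}"))
        ids = new_ids
        for identifier in ids:
            bits[identifier % n_bits] = 1
    return bits


# Water, atoms ordered O, H, H; adjacency uses 1 for connected, 0 for unconnected
symbols = ["O", "H", "H"]
adjacency = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
fp = toy_fingerprint(symbols, adjacency)
```

Because neighbour identifiers are sorted before hashing and only the set of resulting bits is kept, the sketch gives the same bit vector regardless of the order in which the atoms are listed.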
Step S103, inputting the molecular diagram feature matrix and the molecular diagram relation structure information into an operation step prediction model, and outputting an initial operation step corresponding to the chemical reaction formula.
In an embodiment of the disclosure, the operation step prediction model may include a machine-learning-based artificial neural network model, such as a graph convolutional neural network, a recurrent neural network (RNN), or a long short-term memory network (LSTM), and may further include networks derived from these, such as a recurrent neural network composed of gated recurrent units (GRUs). The model may be trained in a supervised, self-supervised or unsupervised manner, and the trained operation step prediction model is obtained when the number of training iterations, or the difference between the predicted value and the true value, meets a preset requirement. In embodiments of the present disclosure, the initial operation steps may include predetermined chemical reaction operation steps, such as feeding, reacting, stirring, filtering, and crystallizing. It should be noted that the initial operation steps are not limited to the above examples; for example, step types such as dissolution and extraction may be added or removed as required. Those skilled in the art may make other modifications in light of the technical spirit of the present application, and all such modifications fall within the protection scope of the present application as long as the functions and effects achieved are the same as or similar to those of the present application. In the embodiment of the disclosure, the operation step prediction model takes the molecular diagram feature matrix and the molecular diagram relation structure information as input and outputs the initial operation steps corresponding to the chemical reaction formula.
Step S105, inputting the molecular fingerprint characteristics into a feeding sequence prediction model, and outputting the feeding sequence of the substances in the chemical reaction formula.
In an embodiment of the disclosure, the feeding sequence prediction model may include a machine learning model, such as a convolutional neural network, a recurrent neural network, or a support vector machine (SVM), and may further include variants of these, such as support vector regression (SVR). The model may be trained in a supervised, self-supervised or unsupervised manner, and the trained feeding sequence prediction model is obtained when the number of training iterations, or the difference between the predicted value and the true value, meets a preset requirement. In the embodiment of the disclosure, the feeding sequence of a substance is the order in which the substance is added in the chemical reaction. For example, if reactant A, reactant B, reactant C and condition D in a chemical reaction formula are fed in the order reactant A, reactant C, condition D, reactant B, then the predicted feeding sequence for A, B, C and D may be 1-4-2-3.
Step S107, matching each substance into the initial operation steps according to the feeding sequence to obtain the operation steps corresponding to the chemical reaction formula.
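The feeding sequence encoding in the example above can be reproduced with a short sketch. The function name `order_labels` and the list-based representation are assumptions made for illustration; the sketch only shows how the labels 1-4-2-3 follow from the stated feeding order, not how a model predicts them.

```python
def order_labels(substances, feeding_sequence):
    """Map each substance (in formula order) to its 1-based feeding position."""
    position = {name: i + 1 for i, name in enumerate(feeding_sequence)}
    return [position[name] for name in substances]


# Substances in the order they appear in the chemical reaction formula
substances = ["reactant A", "reactant B", "reactant C", "condition D"]
# Actual order of addition: A, then C, then D, then B
feeding_sequence = ["reactant A", "reactant C", "condition D", "reactant B"]

labels = order_labels(substances, feeding_sequence)
print("-".join(str(x) for x in labels))  # 1-4-2-3
```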
Referring to FIG. 2, the feeding sequence prediction model 200 takes each substance 201 in the chemical reaction formula as input and outputs the feeding sequence of the substance; for example, the feeding sequence of Na⁺H⁻ is 4 and that of H₂O is 5. The initial operation steps 203 output by the operation step prediction model include dissolving, feeding, stirring, feeding, extracting, drying, concentrating, purifying and the like. According to the feeding sequence, the substances are matched one by one to those initial operation steps that are preset as accepting a substance. For example, the initial operation steps "dissolving" and "feeding" can be followed by a substance, whereas the initial operation step "stirring" cannot. Thus, the substance 204 is matched to the first dissolving step, DMF to the second dissolving step, 1-octyl bromide to the first feeding step, sodium hydride to the second feeding step, and water to the third feeding step.
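The matching rule described above can be sketched as follows. The step list, the set of substance-accepting steps and the feeding positions 1 to 3 are assumptions chosen so that the result agrees with the matching given in the text (only the positions 4 and 5 for sodium hydride and water are stated there); `match_substances` is a hypothetical helper name.

```python
ACCEPTS_SUBSTANCE = {"dissolve", "feed"}  # assumed set of steps that take a substance


def match_substances(steps, feeding_order):
    """Assign substances, in feeding order, to successive substance-accepting steps."""
    queue = sorted(feeding_order, key=feeding_order.get)  # names sorted by position
    plan = []
    for step in steps:
        if step in ACCEPTS_SUBSTANCE and queue:
            plan.append((step, queue.pop(0)))
        else:
            plan.append((step, None))  # e.g. "stir" takes no substance
    return plan


# Hypothetical initial operation steps output by the operation step prediction model
steps = ["dissolve", "dissolve", "feed", "stir", "feed", "stir", "feed", "extract"]
# Feeding positions; 1-3 are assumed, 4 and 5 follow the example in the text
feeding_order = {
    "water": 5,
    "sodium hydride": 4,
    "DMF": 2,
    "substance 204": 1,
    "1-octyl bromide": 3,
}
plan = match_substances(steps, feeding_order)
```

Because the substances are sorted by their predicted feeding position before matching, the resulting plan does not depend on the order in which the substances were supplied.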
According to the embodiment of the disclosure, constructing the molecular diagram feature matrix, the molecular diagram relation structure information and the molecular fingerprint features of each substance in the chemical reaction formula allows the structural features of each substance to be reflected accurately. The operation step prediction model and the feeding sequence prediction model are combined: each substance is matched, according to the feeding sequence output by the feeding sequence prediction model, into the initial operation steps output by the operation step prediction model, yielding the operation steps corresponding to the chemical reaction formula. Because the prediction uses the structural features of the substances and does not need to extract their contextual features, the same operation steps are obtained even when the order in which the substances are input to the model changes, and the accuracy of the operation steps is high.
In one possible implementation manner, the obtaining a molecular map feature matrix of each substance in the chemical reaction formula includes:
acquiring atomic characteristics and chemical bond characteristics of a molecular diagram of a substance in a chemical reaction formula;
and determining a molecular diagram feature matrix of each substance according to the atomic features and the chemical bond features.
In an embodiment of the present disclosure, the atomic features of the molecular diagram include, but are not limited to, at least one of: atomic symbol, formal charge, number of bonds, hybridization mode, number of hydrogens, total valence, chirality, hydrogen bond donor, hydrogen bond acceptor, whether the atom is aromatic, whether it is in a ring, and whether it is in an n-membered ring. The chemical bond features of the molecular diagram include, but are not limited to, at least one of the following: bond type, whether the bond is in a ring, whether it is conjugated, and cis/trans isomerism. In one example, several atomic features and several chemical bond features may be screened out, based on the prediction performance of the model, and combined. A specific construction may be as follows: a substance includes N nodes (atoms), each node having its own atomic features and chemical bond features, and an N×D feature matrix X is formed from the features of the N nodes, where the D dimensions comprise the feature dimensions of the atomic features and the chemical bond features. For example, the first row contains the atomic features and chemical bond features of the first atom in the molecular diagram, the second row those of the second atom, and so on, with the Nth row containing those of the Nth atom. In the embodiment of the disclosure, the row order of the atoms is not limited: the first atom may be located in the first row of the molecular diagram feature matrix or in another row, and likewise the second atom, as long as the features of two atoms do not occupy the same row.
When the molecular diagram feature matrix is constructed, the atomic features and the chemical bond features of the molecular diagram are combined, so that the matrix carries molecular feature information in more dimensions and the operation steps corresponding to the chemical reaction formula can be obtained more accurately.
In one possible implementation, obtaining the structural information of the molecular diagram relationship of each substance in the chemical reaction formula includes:
obtaining a connection relation of molecular diagram nodes of substances in a chemical reaction formula;
and determining molecular diagram relation structure information of each substance according to the connection relation.
In the embodiment of the disclosure, the molecular diagram relation structure information represents the connection relationships of the nodes in the molecule, that is, the position of each node (atom) in the whole molecular diagram. The connection relationship between nodes of the molecular diagram of a substance in the chemical reaction formula may be expressed as 1 when the nodes are connected and as 0 when they are not, from which the molecular diagram relation structure information is constructed. For example, if the molecular diagram includes N nodes, an N×N adjacency matrix A can be constructed, as expressed by formula (1) below. Here the matrix A corresponds to a molecule having 3 atoms, and the rows and columns are indexed by the first atom, the second atom and the third atom. a_12 = 1 indicates that the first atom is connected to the second atom, a_13 = 0 indicates that the first atom is not connected to the third atom, a_21 = 1 indicates that the second atom is connected to the first atom, and so on.
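$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \quad a_{ij} \in \{0, 1\} \qquad (1)$$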
The embodiment of the disclosure provides a construction method of molecular diagram relation structure information, which can conveniently and rapidly construct the molecular diagram relation structure information, thereby being beneficial to expressing the connection information of molecules.
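A minimal sketch of this construction for a three-atom molecule is given below. The bond between the first and second atoms and the absence of a bond between the first and third atoms follow the description of formula (1); the bond between the second and third atoms is an assumption added so that the example molecule is connected. The function name is hypothetical.

```python
def adjacency_matrix(n_atoms, bonds):
    """Build a symmetric N x N matrix with 1 for connected atoms and 0 otherwise."""
    a = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        a[i][j] = 1
        a[j][i] = 1  # a bond connects both atoms, so the matrix is symmetric
    return a


# Atoms are numbered from 0 in code: index 0 is the first atom, and so on.
A = adjacency_matrix(3, [(0, 1), (1, 2)])
# A[0][1] corresponds to a_12, A[0][2] to a_13, A[1][0] to a_21
```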
In one possible implementation manner, the operation step prediction model includes an encoder and a decoder, the inputting the molecular map feature matrix and the molecular map relation structure information into the operation step prediction model, and outputting an initial operation step corresponding to the chemical reaction formula includes:
inputting the molecular diagram feature matrix and molecular diagram relation structure information of each substance to an encoder, and outputting context vectors of each substance through the encoder;
and inputting the context vector to the decoder, and outputting an initial operation step corresponding to the chemical reaction formula.
Referring to FIG. 3, a user may input the name or molecular structure of each substance in the chemical reaction formula 301, and the system automatically converts it to an encoded form, such as SMILES (Simplified Molecular Input Line Entry System). Of course, the user may also directly input the encoded form of each substance in the chemical reaction formula 301, which is easier for the computer to recognize. A feature set 302 of each substance is constructed, including the molecular diagram feature matrix, the molecular diagram relation structure information, and the molecular fingerprint features. The molecular diagram feature matrix and the molecular diagram relation structure information are then passed through the encoder 303 (which may include various convolutional neural networks, such as a graph convolutional neural network). After the molecular diagram feature matrix (N×D) and the molecular diagram relation structure information (N×N) are input into the graph convolutional neural network, the nodes are updated by message passing. Each update can be understood as a matrix multiplication of the relation structure information with the feature matrix, giving an N×D matrix in which each row is a D-dimensional vector representing the embedding feature of a node. The N×D matrix is then reduced along its first dimension, by aggregation (sum, average, weighted sum, etc.) or by a neural network (e.g. set2set, known in the prior art), to obtain the context feature vector (1×D) of the substance.
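The update and aggregation described above can be sketched in a few lines. This is a deliberately simplified illustration: it performs a single update as the plain product of the adjacency matrix and the feature matrix, with no learned weights, self-loops or non-linearity, all of which a trained graph convolutional network would normally include. The function names and the toy values are assumptions.

```python
def matmul(a, b):
    """Plain matrix product of a (n x k) and b (k x m), both given as lists of rows."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def readout(node_embeddings, mode="mean"):
    """Reduce an N x D matrix along its first dimension to a single D-vector."""
    columns = list(zip(*node_embeddings))
    if mode == "sum":
        return [sum(c) for c in columns]
    return [sum(c) / len(c) for c in columns]


A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]     # N x N relation structure (N = 3)
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # N x D feature matrix (D = 2)

H = matmul(A, X)       # one message-passing update: each node sums its neighbours' features
context = readout(H)   # 1 x D context feature vector of the substance (mean aggregation)
```

With these values the middle atom, which has two neighbours, receives the sum of both neighbours' feature rows, while the two end atoms each receive the middle atom's features.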
In the embodiment of the disclosure, the decoder may include a machine-learning-based artificial neural network model, including but not limited to an autoregressive recurrent neural network built from GRU, LSTM or self-attention networks. The decoder may be trained in a supervised, self-supervised or unsupervised manner and is obtained when the number of training iterations, or the difference between the predicted value and the true value, meets a preset requirement. Taking a recurrent neural network composed of gated recurrent units (GRUs) as an example, if the length of the context vector 304 is n, it is composed of the sequence [x1, x2, ..., xn]. The first word x1 of each sequence is set to the start token (START Token) and the last word xn to the end token (END Token). The word embedding feature vector (Embedding) of each word is randomly initialized during training, and the trained word embedding feature vectors are used for computation during prediction. During autoregressive generation, the word embedding feature vector of the word predicted at step x1, together with the hidden state (Hidden State) output by the hidden layer at that step, is taken as the input for step x2, and so on; generation stops when the output is the END Token or the specified maximum sequence length is reached. The START Token serves to activate the prediction of the first actual word in the sequence. The END Token marks the end of the sequence and stops the loop. Setting a maximum sequence length allows the loop to be stopped even when the END Token is never predicted.
In an embodiment of the disclosure, the operation step prediction model includes an encoder and a decoder, where the encoder is configured to extract a context vector of each substance of the chemical reaction formula, so that the context vector includes both structural feature information of the molecule and positional feature information of the molecule. By inputting the context vector into the decoder, a more accurate prediction result can be output.
In one possible implementation, the substance includes a reactant, a condition, and a product; inputting the molecular diagram feature matrix and the molecular diagram relation structure information of each substance to an encoder, and outputting the context vector of each substance through the encoder, wherein the method comprises the following steps:
respectively acquiring a context vector of the reactant, a context vector of the condition object and a context vector of the product;
adding the context vector of the reactant and the context vector of the condition object to obtain an intermediate vector;
and combining the intermediate vector with the context vector of the product to obtain the context vector of the chemical reaction formula.
In the disclosed embodiment, the context vector of each substance is obtained, namely the context vector of the reactant, the context vector of the condition and the context vector of the product. The context vector of the reactant is added to the context vector of the condition to obtain an intermediate vector, for example a 1×H feature vector. The context vector of the product is likewise, for example, a 1×H feature vector. The intermediate vector and the context vector of the product are then merged (concatenated) to obtain the context vector of the chemical reaction formula, a 1×2H vector. Here H is a user-defined encoding feature dimension, e.g. 128 or 256.
In an embodiment of the disclosure, a method for generating the context vector corresponding to a chemical reaction formula is provided, in which the context vector of the reactant and the context vector of the condition are added, and the sum is then concatenated with the context vector of the product. The conditions are closer to the reactants than to the products, since both are substances present before the reaction. Adding these two and concatenating the result with the product therefore retains the feature information of each substance in the chemical reaction formula while reducing the dimension of the feature vector.
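The add-then-concatenate combination can be written as a short sketch. The function name and the toy dimension H = 4 are assumptions made for illustration; in practice H would be, for example, 128 or 256 as stated above.

```python
def reaction_context(reactant, condition, product):
    """Add reactant and condition vectors (1 x H), then concatenate the product (1 x 2H)."""
    if not (len(reactant) == len(condition) == len(product)):
        raise ValueError("all context vectors must share the same dimension H")
    intermediate = [r + c for r, c in zip(reactant, condition)]  # element-wise sum, 1 x H
    return intermediate + product                                # concatenation, 1 x 2H


reactant_vec = [0.1, 0.2, 0.3, 0.4]
condition_vec = [1.0, 1.0, 1.0, 1.0]
product_vec = [0.5, 0.6, 0.7, 0.8]

context = reaction_context(reactant_vec, condition_vec, product_vec)
```

Concatenating all three vectors instead would give a 1×3H vector; summing the two pre-reaction vectors first keeps the result at 1×2H.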
In one possible implementation, the decoder includes a GRU network decoder, the inputting the context vector to the decoder, outputting the initial operation step corresponding to the chemical reaction formula includes:
inputting a word vector of a preset start word symbol and the context vector into a first GRU unit of a GRU network decoder, and outputting a word vector of a first predicted word symbol and a first hidden state, wherein the GRU network decoder comprises a plurality of GRU units;
and inputting the word vector of the first predicted word symbol and the first hidden state into a next GRU unit, until the preset end word symbol is output or the length of the predicted word symbols reaches a preset value, so as to obtain an initial operation step corresponding to the chemical reaction formula.
In the embodiment of the disclosure, the preset start word symbol plays an activating role, so as to predict the first word symbol of the initial operation steps. The context vector is used as the hidden state of the first GRU unit: the word vector of the preset start word symbol and the context vector are input to the first GRU unit of the GRU network decoder, which outputs the word vector of the first predicted word symbol and the first hidden state. The word vector of the first predicted word symbol and the first hidden state are input to the second GRU unit, which outputs the word vector of the second predicted word symbol and the second hidden state, and so on. The loop stops when the Nth GRU unit outputs the preset end word symbol or when the length of the predicted word symbols reaches a preset value. Setting a limit on the length of the predicted word symbols allows the loop to be stopped even when the end word symbol is never predicted.
According to the embodiment of the disclosure, the decoder is constructed from a GRU network; a GRU decoder has the advantages of a simple structure and fast convergence.
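The control flow of this decoding loop, with its two stopping conditions, can be illustrated as follows. The sketch replaces the trained GRU unit with a fixed lookup table (`TRANSITIONS`), which is purely a stand-in; the token strings, function names and table contents are assumptions and do not represent a trained model.

```python
START, END = "<START>", "<END>"


def decode(step_fn, context_vector, max_len=10):
    """Greedy autoregressive decoding: stop on END or once max_len tokens are produced."""
    token, hidden = START, context_vector  # the context vector initialises the hidden state
    output = []
    while len(output) < max_len:
        token, hidden = step_fn(token, hidden)  # previous token + hidden state -> next ones
        if token == END:
            break
        output.append(token)
    return output


# Stand-in for a trained GRU unit: a fixed transition table (hypothetical).
TRANSITIONS = {
    START: "dissolve",
    "dissolve": "feed",
    "feed": "stir",
    "stir": "filter",
    "filter": END,
}


def toy_step(token, hidden):
    """Return the next token; the hidden state is passed through unchanged."""
    return TRANSITIONS[token], hidden


steps = decode(toy_step, context_vector=[0.0, 0.0])
truncated = decode(toy_step, context_vector=[0.0, 0.0], max_len=2)
```

The second call shows the length cap taking effect before the end token is reached.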
In one possible implementation manner, the training manner of the operation step prediction model includes:
obtaining a first sample set, wherein the first sample set comprises a first chemical reaction formula sample labeled with operation step categories;
obtaining a molecular diagram feature matrix sample and a molecular diagram relation structure information sample of a substance in the first chemical reaction formula sample;
inputting the molecular diagram feature matrix sample and the molecular diagram relation structure information sample into an initial operation step prediction model to generate a prediction result;
and iteratively adjusting training parameters in the initial operation step prediction model based on the difference between the prediction result and the marked operation step category until the preset requirement is met, so as to obtain the operation step prediction model.
In embodiments of the present disclosure, the operation step prediction model may be trained by supervised machine learning. Specifically, a first sample set is obtained, which includes a first chemical reaction formula sample labeled with operation step categories. The operation step categories may include, but are not limited to, feeding, reacting, stirring, filtering, and crystallizing. In an embodiment of the present disclosure, the molecular diagram feature matrix sample and the molecular diagram relation structure information sample of a substance in the first chemical reaction formula sample may be obtained by any of the methods in the foregoing embodiments. The molecular diagram feature matrix sample and the molecular diagram relation structure information sample are input into an initial operation step prediction model to generate a prediction result. The initial operation step prediction model may include a graph convolutional neural network, a recurrent neural network (RNN), a long short-term memory network (LSTM), and the like, and may also include networks derived from these, such as a recurrent neural network composed of gated recurrent units (GRUs). Training parameters are set in the initial operation step prediction model. The prediction result may include predicted initial operation steps such as feeding, reacting, stirring, filtering, and crystallizing. The training parameters in the initial operation step prediction model are iteratively adjusted based on the difference between the prediction result and the labeled operation step categories until the number of iterations or the difference meets the preset requirement, yielding the operation step prediction model.
According to the embodiment of the disclosure, the operation step prediction model is trained by a deep learning method on molecular diagram feature matrix samples and molecular diagram relation structure information samples of substances; because these samples accurately reflect the structural features of the substances, the resulting operation step prediction model gives more accurate predictions.
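The iterative adjustment with its two stopping criteria (iteration count or difference meeting a preset requirement) can be illustrated with a deliberately tiny stand-in. The sketch below fits a one-parameter linear model by gradient descent on a mean squared error; it is not the neural network of the disclosure, and the learning rate, tolerance and data are assumptions chosen only to show the loop structure.

```python
def mse(w, samples):
    """Mean squared difference between predictions w * x and labels y."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)


def train(samples, lr=0.05, max_iters=1000, tol=1e-6):
    """Adjust w until the loss falls below tol or max_iters updates have been made."""
    w = 0.0  # initial training parameter
    for _ in range(max_iters):
        if mse(w, samples) < tol:  # difference meets the preset requirement
            break
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad             # one iterative adjustment of the parameter
    return w, mse(w, samples)


# Labeled toy samples (x, y) generated from y = 2 * x
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, loss = train(samples)
w_one_step, loss_one_step = train(samples, max_iters=1)
```

The first call stops because the loss requirement is met; the second stops because the iteration budget is exhausted.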
In one possible implementation manner, the training manner of the feeding sequence prediction model includes:
acquiring a second sample set, wherein the second sample set comprises a second chemical reaction formula sample, and substances in the second chemical reaction formula sample are labeled with a feeding sequence;
obtaining a molecular fingerprint feature sample of a substance in the second chemical reaction formula sample;
inputting the molecular fingerprint characteristic sample into an initial feeding sequence prediction model to generate a prediction result;
and iteratively adjusting training parameters in the initial feeding sequence prediction model based on the difference between the prediction result and the marked feeding sequence until the preset requirement is met, so as to obtain the feeding sequence prediction model.
In embodiments of the present disclosure, the feeding sequence prediction model may be trained by supervised machine learning. Specifically, a second sample set is obtained, which includes a second chemical reaction formula sample in which the substances are labeled with a feeding sequence. The feeding sequence may be expressed as 1, 2, 3, 4, … or A, B, C, D, …. In an embodiment of the disclosure, the molecular fingerprint feature sample of a substance in the second chemical reaction formula sample may be obtained by any of the methods in the above embodiments. The molecular fingerprint feature sample is input into an initial feeding sequence prediction model to generate a prediction result. The initial feeding sequence prediction model may include a convolutional neural network, a recurrent neural network, a support vector machine (SVM), and the like, and may also include variants of these, such as support vector regression (SVR). Training parameters are set in the initial feeding sequence prediction model. The prediction result may include a predicted feeding sequence, such as 1, 2, 3, 4, … or A, B, C, D, …. The training parameters in the initial feeding sequence prediction model are iteratively adjusted based on the difference between the prediction result and the labeled feeding sequence until the number of iterations or the difference meets the preset requirement, yielding the feeding sequence prediction model.
According to the embodiment of the disclosure, the feeding sequence prediction model is trained by a deep learning method on molecular fingerprint feature samples of substances; because these samples accurately reflect the characteristics of the substances, the resulting feeding sequence prediction model gives more accurate results.
In a possible implementation manner, after the matching the substances to the initial operation steps according to the feeding sequence, the operation steps corresponding to the chemical reaction formula are obtained, the method further includes:
obtaining a plurality of groups of operation steps corresponding to the chemical reaction formulas;
for each group of operation steps in the multiple groups of operation steps, performing, by an automated test device, a chemical reaction operation on the substances in the chemical reaction formula according to the group of operation steps to obtain a corresponding reaction result;
and determining the optimal operation steps from the multiple groups of operation steps according to the multiple reaction results and a preset optimization algorithm.
In the embodiment of the disclosure, multiple groups of predicted operation steps can be obtained by beam search, with the number of groups determined by the beam width (beam size). In one example, the m groups of operation steps with the highest probability are selected from the n groups (n > m), the selected m groups are converted, according to predefined rules, into communication parameters that the automated test device can accept, and the automated test device performs the chemical reaction operation according to each group of operation steps to obtain the corresponding reaction result. The reaction results include, but are not limited to, the yield of the product in the chemical reaction formula, the purity of the product, and the like. The data of this batch of experiments are fitted according to the reaction results, the next group of operation steps is selected from the remaining n - m groups, and the process is iterated until the results meet the preset requirement. For the multiple reaction results, a preset optimization algorithm, such as a Bayesian algorithm, is used to determine the optimal operation steps from the multiple groups of operation steps. In one example, the operation steps are converted into human-readable words or phrases: for example, the word symbols in the operation steps are converted back into the original characters according to a vocabulary, and any time or temperature values are converted into preset interval ranges. The encoded forms (e.g., SMILES) of the reactants, conditions and products are converted into names or the like. In embodiments of the present disclosure, the automated test device may include screening devices, reactors, detection devices, and the like.
In the embodiment of the present disclosure, referring to FIG. 4, the automated test device 401 performs automated chemical reactions according to each of the multiple groups of operation steps to obtain multiple reaction results (experimental results). Screening the multiple groups of operation steps according to the reaction results makes it possible to determine the optimal chemical reaction operation steps. This further improves the accuracy of the generated operation steps, reduces manual labor, and raises the degree of automation.
FIG. 5 is a flow chart illustrating a method of generating chemical reaction operation steps according to another exemplary embodiment. Referring to FIG. 5, the method includes:
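How beam search yields several candidate groups of operation steps, ranked by probability, can be sketched as follows. The next-token probabilities come from a small hand-written table (`TABLE`) that stands in for the decoder's output distribution; the table values, token names and function signature are assumptions for illustration, and sequences that have not reached the end token within `max_len` steps are simply dropped in this sketch.

```python
import math

START, END = "<START>", "<END>"

# Hypothetical next-token probabilities, keyed by the previous token.
TABLE = {
    START: {"feed": 0.6, "dissolve": 0.4},
    "feed": {"stir": 0.7, END: 0.3},
    "dissolve": {"feed": 0.9, END: 0.1},
    "stir": {END: 1.0},
}


def beam_search(table, beam_size=2, max_len=5):
    """Return up to beam_size finished (sequence, log_probability) pairs, best first."""
    beams = [([START], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, p in table[seq[-1]].items():
                candidates.append((seq + [token], score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:  # keep only the best beam_size candidates
            (finished if seq[-1] == END else beams).append((seq, score))
        if not beams:
            break
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:beam_size]


results = beam_search(TABLE, beam_size=2)
```

With a beam width of 2 the search returns two groups of steps; the lower-probability sequence that ends immediately after "feed" is pruned along the way.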
Step S101, obtaining a molecular diagram feature matrix, molecular diagram relation structure information and molecular fingerprint features of each substance in the chemical reaction formula.
In embodiments of the present disclosure, the substances in the chemical reaction formula may include reactants, conditions, products, and the like. The molecular diagram (molecular graph) is a graph that represents the structural properties of a molecule: a node in the molecular diagram represents an atom of the molecule, and an edge represents a covalent bond. In one example, a molecular diagram feature matrix may be constructed from the atomic features of the molecular diagram. The atomic features include, but are not limited to, at least one of the following: atomic symbol, formal charge, number of bonds, hybridization mode, number of hydrogens, total valence, chirality, hydrogen bond donor, hydrogen bond acceptor, whether the atom is aromatic, whether it is in a ring, and whether it is in an n-membered ring. The molecular fingerprint features may include Morgan fingerprints, extended-connectivity fingerprints (ECFPs), and the like.
Step S503, inputting the molecular map feature matrix and the molecular map relation structure information of each substance to an encoder, and outputting the context vector of each substance via the encoder.
In the disclosed embodiment, the encoder 303 (which may include various convolutional neural networks, such as a graph convolutional neural network) is used. After the molecular diagram feature matrix (N×D) and the molecular diagram relation structure information (N×N) are input into the graph convolutional neural network, the nodes are updated by message passing. Each update can be understood as a matrix multiplication of the relation structure information with the feature matrix, giving an N×D matrix in which each row is a D-dimensional vector representing the embedding feature of a node. The N×D matrix is then reduced along its first dimension, by aggregation (sum, average, weighted sum, etc.) or by a neural network (e.g. set2set, known in the prior art), to obtain the context feature vector (1×D) of the substance.
Step S509, inputting the word vector of the preset start word symbol and the context vector to a first GRU unit of a GRU network decoder, and outputting the word vector of the first predicted word symbol and the first hidden state, wherein the GRU network decoder includes a plurality of GRU units.
Step S511, inputting the word vector of the first predicted word symbol and the first hidden state into the next GRU unit, and repeating until the preset end word symbol is output or the length of the predicted word symbols reaches a preset value, so as to obtain an initial operation step corresponding to the chemical reaction formula.
In the embodiment of the disclosure, the preset start word symbol acts as a trigger for predicting the first word symbol of the initial operation step. The context feature vector serves as the initial hidden state of the first GRU unit: the word vector of the preset start word symbol and the context vector are input to the first GRU unit of the GRU network decoder, which outputs the word vector of the first predicted word symbol and the first hidden state. The word vector of the first predicted word symbol and the first hidden state are then input to the second GRU unit, which outputs the word vector of the second predicted word symbol and the second hidden state, and so on. The loop stops when the Nth GRU unit outputs the preset end word symbol or when the length of the predicted word symbols reaches a preset value. Capping the length of the predicted word symbols ensures that the loop stops even if the end word symbol is never predicted.
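The control flow of this decoding loop, with its two stopping conditions, can be sketched as below. The GRU unit itself is replaced by a hypothetical stand-in, `scripted_step`, that replays a fixed token sequence; the names `START`, `END`, `decode` and `scripted_step` are assumptions for illustration only.

```python
START, END = "<start>", "<end>"


def decode(step_fn, context_vector, max_len):
    """Run the decoder loop until END is produced or max_len tokens are predicted."""
    hidden = context_vector  # the context vector is the first hidden state
    token = START            # the start symbol triggers the first prediction
    predicted = []
    while len(predicted) < max_len:
        token, hidden = step_fn(token, hidden)
        if token == END:
            break
        predicted.append(token)
    return predicted


def scripted_step(script):
    """Stand-in for a trained GRU unit: the toy hidden state is a position counter."""
    def step_fn(token, hidden):
        position = int(hidden[0])
        next_token = script[position] if position < len(script) else END
        return next_token, [position + 1]
    return step_fn


step = scripted_step(["add", "stir", "filter"])
full = decode(step, [0], max_len=10)    # stops on the end symbol
capped = decode(step, [0], max_len=2)   # stops on the length cap
```

With a trained model, `step_fn` would embed the previous token, apply the GRU cell to that embedding and the hidden state, and pick the next token from the output distribution.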
Step S513, obtaining a plurality of groups of operation steps corresponding to the chemical reaction formulas.
Step S515, for each set of operation steps in the multiple sets of operation steps, obtaining a corresponding reaction result by performing a chemical reaction operation on the substance in the chemical reaction formula by the automated test apparatus according to the set of operation steps.
Step S517, determining an optimal operation step from the multiple operation steps according to multiple reaction results and a preset optimization algorithm.
In the embodiment of the disclosure, multiple groups of predicted operation steps may be obtained by beam search, with the number of groups determined by the beam width (beam size). In one example, the m groups of operation steps with the highest probability are selected from the n groups of operation steps (n > m), the selected m groups are converted, according to predefined rules, into communication parameters that the automatic test device can accept, and the automatic test device performs the chemical reaction operation according to each group of operation steps to obtain a corresponding reaction result. The reaction results include, but are not limited to, the yield of the product in the chemical reaction formula, the purity of the product, and the like. The data from this batch of experiments are fitted according to the reaction results, the next group of operation steps is selected from the remaining n - m groups, and the process is iterated until the results meet the preset requirements.
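A minimal beam search sketch is shown below to make the role of the beam width concrete. The decoder is replaced by a hypothetical function `toy_model` that returns a fixed next-token distribution for a given prefix; the token names and probabilities are invented for the example.

```python
import math


def beam_search(next_token_probs, beam_size, max_len, end_token="<end>"):
    """Return up to beam_size (tokens, log_probability) pairs, best first."""
    beams = [((), 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for token, prob in next_token_probs(tokens).items():
                if prob <= 0.0:
                    continue
                extended = (tokens + (token,), score + math.log(prob))
                if token == end_token:
                    finished.append(extended)
                else:
                    candidates.append(extended)
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_size]  # keep only the best partial sequences
        if not beams:
            break
    results = finished + beams  # unfinished beams remain if max_len was hit
    results.sort(key=lambda item: item[1], reverse=True)
    return results[:beam_size]


def toy_model(tokens):
    """Hypothetical next-token distribution standing in for the decoder."""
    if not tokens:
        return {"add": 0.6, "heat": 0.4}
    if tokens[-1] == "add":
        return {"stir": 0.7, "<end>": 0.3}
    if tokens[-1] == "heat":
        return {"stir": 0.4, "<end>": 0.6}
    return {"<end>": 1.0}


groups = beam_search(toy_model, beam_size=2, max_len=4)
```

Running the search with a beam width of n and keeping the first m entries of the result corresponds to the selection of the m most probable groups described above.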
In one possible implementation manner, for each group of operation steps in the multiple groups of operation steps, obtaining the corresponding reaction result produced by the automatic test device performing the chemical reaction operation on the substances in the chemical reaction formula according to that group of operation steps includes:
for each group of operation steps in the multiple groups of operation steps, converting the group of operation steps into instruction information recognized by the automatic test device, wherein the instruction information is used for instructing the automatic test device to perform the chemical reaction operation on the substances in the chemical reaction formula according to the group of operation steps to obtain a corresponding reaction result.
In an embodiment of the present disclosure, the automated control platform includes, but is not limited to, a control system developed using LabVIEW. The algorithm and the control system may communicate by, without limitation, a protocol such as HTTP or WebSocket. The output result is transmitted to the automated control platform, which generates the machine instructions corresponding to the operation steps. The complete chemical reaction conditions (reactants, catalysts, solvents, reagents) and the M operation steps selected by the Bayesian optimization model are input to the automated control platform of the chemical reaction laboratory. The automated control platform converts the input parameter information into instruction information of the platform and submits the instructions to the automated experiment platform for the experiment. After the automated experiment platform completes the experiment, experimental result information, including but not limited to yield and purity, is obtained and fed back to the Bayesian optimization model. The Bayesian optimization model uses a proxy (surrogate) model to fit the real experimental results and uses an acquisition function to select the next group of experimental operation steps.
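The conversion of a group of operation steps into a message for the control platform might look like the following sketch. The JSON layout, the field names (`reaction_id`, `commands`, `seq`, `action`, `params`) and the step dictionaries are all hypothetical; the disclosure only states that predefined rules and a protocol such as HTTP or WebSocket may be used, and the actual transmission is left out here.

```python
import json


def steps_to_instruction(reaction_id, steps):
    """Serialize a group of operation steps into a JSON instruction payload."""
    commands = []
    for index, step in enumerate(steps, start=1):
        params = {key: value for key, value in step.items() if key != "action"}
        commands.append({"seq": index, "action": step["action"], "params": params})
    return json.dumps({"reaction_id": reaction_id, "commands": commands}, sort_keys=True)


# Hypothetical group of operation steps produced by the prediction models.
example_steps = [
    {"action": "add", "substance": "reactant_A", "volume_ml": 5.0},
    {"action": "heat", "temperature_c": 60, "duration_min": 30},
    {"action": "sample"},
]
payload = steps_to_instruction("rxn-001", example_steps)
```

The resulting string could be sent as the body of an HTTP request or as a WebSocket message, with the platform mapping each command onto its own machine instructions.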
In one possible implementation manner, the preset optimization algorithm includes a Bayesian optimization model, the reaction result includes the yield or purity of a product, and determining the optimal operation steps from the multiple groups of operation steps according to the multiple reaction results and the preset optimization algorithm includes:
inputting the groups of operation steps into an initialized proxy model, and fitting the initialized proxy model to the yield or purity of the product, wherein the proxy model is used for representing a prior distribution;
selecting a group of operation steps as the candidate test based on the result of the acquisition function on the prior distribution, wherein the acquisition function is used for deciding, based on the mean and the variance given by the initialized proxy model, whether to explore a new combination of operation steps or to exploit a group of operation steps for which experimental values have already been obtained;
testing according to the candidate group of operation steps by using the automatic test device to obtain the actual yield or purity of the product;
updating the prior distribution of the proxy model by using the actual yield or purity of the product to obtain a posterior distribution;
and repeating the above process, with the posterior distribution used as the updated prior distribution, until the actual yield or purity of the product meets the preset requirement.
In the embodiment of the disclosure, the automatic reaction platform mainly comprises four parts: an automatically controlled sample injection system, an automatically controlled reaction system, an automatically controlled post-treatment system, and an automatic sampling and detection system. The sample injection system is responsible for feeding the reaction kettle (the container in which the chemical reaction occurs): a multi-way valve selects the corresponding pipeline materials in the order of reactants, catalysts, solvents and reagents recommended by the reaction operation step model, and a syringe pump draws the materials and pumps them into the reaction kettle for reaction. The reaction system monitors real-time conditions such as temperature through temperature sensors and other sensors, and controls the reaction temperature and the feeding flow rate and quantity according to the reaction temperature and reaction time recommended by the reaction operation step model. The post-treatment system operates according to the post-treatment steps recommended in the operation steps, such as separation, acid washing and alkali washing. Finally, the product is passed to a sampler, which automatically takes a sample and delivers it to a detection device, composed of a mass spectrometer, a chromatograph and the like, to detect the experimental results and obtain data such as yield and purity.
In the disclosed embodiment, the proxy model representing the prior distribution may be initialized with, but is not limited to, a Gaussian process. The input features of the proxy model are the operation steps recommended by the operation step model, and the model is fitted to the yield/purity of the experimentally detected product obtained in S7. The operation steps to be explored in the next batch of experiments are chosen so that the acquisition function on the current prior distribution is maximized. Based on the mean and variance estimates given by the model, the acquisition function decides whether to explore new combinations of operation steps or to exploit combinations of operation steps for which experimental values have already been obtained. Experiments are then run with the operation steps selected in 2 to obtain the experimental results (yield/purity). The prior distribution of the proxy model is updated with the new data to give a posterior distribution, which serves as the prior distribution for the next iteration. These steps are repeated until the preset yield/purity is reached, at which point the iteration stops.
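The loop described above can be sketched in a few dozen lines. This is an illustration under several assumptions, not the optimizer of the disclosure: each group of operation steps is assumed to be already encoded as a numeric tuple, the proxy model is a zero-mean Gaussian process with an RBF kernel, the acquisition function is an upper confidence bound (mean plus `kappa` standard deviations), and `run_experiment` stands in for the automated platform.

```python
import math


def rbf(x, y, length_scale=1.0):
    """Squared-exponential kernel between two feature tuples."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2 * length_scale ** 2))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """Posterior mean and variance of the proxy model at x_new."""
    if not x_obs:
        return 0.0, 1.0  # prior: zero mean, unit variance
    k = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(x_obs)]
         for i, a in enumerate(x_obs)]
    k_star = [rbf(a, x_new) for a in x_obs]
    mean = sum(ks * w for ks, w in zip(k_star, solve(k, y_obs)))
    reduction = sum(ks * v for ks, v in zip(k_star, solve(k, k_star)))
    return mean, max(rbf(x_new, x_new) - reduction, 0.0)


def ucb(mean, variance, kappa=2.0):
    """Acquisition: a high mean favours exploitation, a high variance exploration."""
    return mean + kappa * math.sqrt(variance)


def bayes_optimize(candidates, run_experiment, target, max_rounds):
    """Test candidates one at a time until the target result is reached."""
    x_obs, y_obs = [], []
    remaining = list(candidates)
    while remaining and len(x_obs) < max_rounds:
        chosen = max(remaining, key=lambda x: ucb(*gp_posterior(x_obs, y_obs, x)))
        remaining.remove(chosen)
        result = run_experiment(chosen)  # stands in for the automated test device
        x_obs.append(chosen)
        y_obs.append(result)             # the posterior becomes the next prior
        if result >= target:
            break
    if not y_obs:
        return None
    best = max(range(len(y_obs)), key=y_obs.__getitem__)
    return x_obs[best], y_obs[best], len(x_obs)
```

A simulated run might use `bayes_optimize([(t / 10,) for t in range(11)], lambda x: 1 - (x[0] - 0.7) ** 2, target=0.999, max_rounds=11)`, where the lambda plays the role of a yield measurement that peaks at a feature value of 0.7.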
Fig. 6 is a schematic block diagram illustrating an apparatus for generating chemical reaction operation steps according to an exemplary embodiment. Referring to fig. 6, the apparatus 600 includes:
the first obtaining module 601 is configured to obtain a molecular map feature matrix, molecular map relationship structure information, and molecular fingerprint features of each substance in the chemical reaction formula;
the first prediction module 603 is configured to input the molecular map feature matrix and the molecular map relationship structure information to an operation step prediction model, and output an initial operation step corresponding to the chemical reaction formula;
the second prediction module 605 is configured to input the molecular fingerprint feature into a feeding sequence prediction model, and output a feeding sequence of the substance in the chemical reaction formula;
and a generating module 607, configured to match each substance to the initial operation step according to the feeding sequence, so as to obtain an operation step corresponding to the chemical reaction formula.
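One way to picture what the generating module 607 does is shown below. The representation is an assumption made for this sketch: the initial operation steps are taken to be a list of action words in which every `"ADD"` marks a feeding slot, and the feeding sequence is an ordered list of substance names that are matched to those slots in turn.

```python
def fill_steps(initial_steps, feeding_order):
    """Match substances, in feeding order, to the ADD slots of the initial steps."""
    queue = list(feeding_order)
    filled = []
    for step in initial_steps:
        if step == "ADD":
            if not queue:
                raise ValueError("more ADD slots than substances")
            filled.append("ADD " + queue.pop(0))
        else:
            filled.append(step)
    if queue:
        raise ValueError("substances left without an ADD slot: %s" % queue)
    return filled


# Hypothetical outputs of the two prediction models.
initial_steps = ["ADD", "ADD", "STIR", "ADD", "HEAT", "FILTER"]
feeding_order = ["solvent", "reactant_A", "catalyst"]
operation_steps = fill_steps(initial_steps, feeding_order)
```

Raising an error on a mismatch between slots and substances is a design choice of this sketch; an implementation could instead discard or re-rank such candidate groups.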
In one possible implementation manner, the acquiring module includes:
the first acquisition submodule is used for acquiring the atomic characteristics and the chemical bond characteristics of the molecular diagram of the substance in the chemical reaction formula;
and the first determining submodule is used for determining a molecular diagram feature matrix of each substance according to the atomic feature and the chemical bond feature.
In one possible implementation manner, the acquiring module includes:
the second acquisition submodule is used for acquiring the connection relation of molecular diagram nodes of substances in the chemical reaction formula;
and the second determining submodule is used for determining molecular diagram relation structure information of each substance according to the connection relation.
In one possible implementation, the operation step prediction model includes an encoder and a decoder, and the first prediction module includes:
the first extraction submodule is used for inputting the molecular diagram feature matrix and the molecular diagram relation structure information of each substance to an encoder, and outputting the context vector of each substance through the encoder;
and the first prediction submodule is used for inputting the context vector of each substance to the decoder and outputting the initial operation step corresponding to the chemical reaction formula.
In one possible implementation, the substance includes a reactant, a condition, and a product; the first extraction submodule includes:
the acquisition unit is used for inputting the molecular diagram feature matrix and the molecular diagram relation structure information of each substance to the encoder, and outputting the context vector of each substance through the encoder;
The processing unit is used for adding the context vector of the reactant and the context vector of the condition object to obtain an intermediate vector;
and the generating unit is used for combining the intermediate vector and the context vector of the product to obtain the context vector of the chemical reaction formula.
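The two operations performed by the processing unit and the generating unit can be sketched as follows. The element-wise addition follows the text; interpreting "combining" as concatenation is an assumption of this example, as are the function names.

```python
def add_vectors(vectors):
    """Element-wise sum of equally sized context vectors."""
    return [sum(values) for values in zip(*vectors)]


def reaction_context(reactant_vectors, condition_vectors, product_vector):
    """Add reactant and condition vectors, then combine with the product vector."""
    intermediate = add_vectors(reactant_vectors + condition_vectors)
    # Assumption: "combining" is taken to mean concatenation (length 2 * D).
    return intermediate + product_vector


context = reaction_context(
    reactant_vectors=[[1.0, 2.0], [0.5, 0.5]],
    condition_vectors=[[1.0, 0.0]],
    product_vector=[3.0, 4.0],
)
```

Summing the reactant and condition vectors makes the intermediate vector independent of the order in which those substances are listed, while keeping the product separate preserves the direction of the reaction.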
In one possible implementation, the decoder includes a GRU network decoder, and the first prediction submodule includes:
the first prediction unit is used for inputting a word vector of a preset start word symbol and the context vector into a first GRU unit of the GRU network decoder, and outputting the word vector of the first prediction word symbol and a first hidden state, wherein the GRU network decoder comprises a plurality of GRU units;
and the second prediction unit is used for inputting the word vector of the first predicted word symbol and the first hidden state into the next GRU unit, and repeating until the preset end word symbol is output or the length of the predicted word symbols reaches a preset value, so as to obtain an initial operation step corresponding to the chemical reaction formula.
In one possible implementation manner, the system further comprises a first training module, where the first training module includes:
a third acquisition sub-module for acquiring a first sample set, wherein the first sample set includes a first chemical reaction formula sample labeled with an operation step category;
A fourth obtaining submodule, configured to obtain a molecular diagram feature matrix sample and a molecular diagram relationship structure information sample of a substance in the first chemical reaction formula sample;
the second prediction submodule is used for inputting the molecular diagram feature matrix sample and the molecular diagram relation structure information sample into an initial operation step prediction model to generate a prediction result;
the first generation sub-module is used for iteratively adjusting training parameters in the initial operation step prediction model based on the difference between the prediction result and the marked operation step category until the preset requirement is met, so as to obtain the operation step prediction model.
In one possible implementation manner, the system further comprises a second training module, where the second training module includes:
a fifth sub-module, configured to obtain a second sample set, where the second sample set includes a second chemical reaction formula sample, and the substances in the second chemical reaction formula sample are labeled with a feeding sequence;
a sixth sub-module, configured to obtain a molecular fingerprint feature sample of a substance in the second chemical reaction formula sample;
the third prediction submodule is used for inputting the molecular fingerprint characteristic sample into an initial feeding sequence prediction model to generate a prediction result;
And the second generation sub-module is used for iteratively adjusting training parameters in the initial feeding sequence prediction model based on the difference between the prediction result and the marked feeding sequence until the preset requirement is met, so as to obtain the feeding sequence prediction model.
In one possible implementation, the apparatus further includes:
the second acquisition module is used for acquiring a plurality of groups of operation steps corresponding to the chemical reaction formula;
the third acquisition module is used for acquiring, for each group of operation steps in the plurality of groups of operation steps, the corresponding reaction result obtained by the automatic test device performing the chemical reaction operation on the substances in the chemical reaction formula according to the group of operation steps;
and the determining module is used for determining the optimal operation steps from the plurality of groups of operation steps according to a plurality of reaction results and a preset optimization algorithm.
The specific manner in which the various modules perform the operations in the apparatus of the above embodiments have been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
In one possible implementation, a system for generating chemical reaction operation steps is provided, comprising:
an automated test apparatus comprising, electrically connected to one another: a screening device, such as a high-throughput screening device; a reactor, such as a tank reactor or a continuous flow reactor; and a detection device, such as an infrared or mass spectrometry detection device;
An electronic device, comprising:
a processor, a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method of generating the chemical reaction operating step according to any one of the embodiments of the present disclosure.
Fig. 7 is a block diagram illustrating an apparatus 700 for generating chemical reaction operation steps according to an exemplary embodiment. For example, the apparatus 700 may be provided as a server. Referring to fig. 7, the apparatus 700 includes a processing component 722, which further includes one or more processors, and memory resources represented by a memory 732 for storing instructions, such as application programs, executable by the processing component 722. The application programs stored in the memory 732 may include one or more modules, each of which corresponds to a set of instructions. Further, the processing component 722 is configured to execute the instructions to perform the methods described above.
The apparatus 700 may further comprise a power component 726 configured to perform power management of the apparatus 700, a wired or wireless network interface 750 configured to connect the apparatus 700 to a network, and an input/output (I/O) interface 758. The apparatus 700 may operate based on an operating system stored in the memory 732, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided, such as the memory 732, comprising instructions executable by the processing component 722 of the apparatus 700 to perform the above-described method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
In an exemplary embodiment, a computer program product is also provided, comprising instructions executable by a processor of the generating device 700 to perform the generating method of the chemical reaction operating steps described above.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow the general principles of the disclosure and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (23)

1. A method for generating chemical reaction operation steps based on machine learning and an automated test device, comprising:
acquiring a molecular diagram feature matrix, molecular diagram relation structure information and molecular fingerprint characteristics of each substance in a chemical reaction formula;
inputting the molecular diagram feature matrix and the molecular diagram relation structure information into an operation step prediction model, and outputting an initial operation step corresponding to the chemical reaction formula;
inputting the molecular fingerprint characteristics into a feeding sequence prediction model, and outputting the feeding sequence of the substances in the chemical reaction formula;
according to the feeding sequence, matching each substance into the initial operation steps to obtain a plurality of groups of operation steps corresponding to the chemical reaction formula;
and screening the multiple groups of operation steps by using an automatic test device and a preset optimization algorithm to determine the optimal operation steps.
2. The method of claim 1, wherein the obtaining a molecular map feature matrix for each substance in the chemical reaction formula comprises:
Acquiring atomic characteristics and chemical bond characteristics of a molecular diagram of a substance in a chemical reaction formula;
and determining a molecular diagram feature matrix of each substance according to the atomic features and the chemical bond features.
3. The method of claim 1, wherein obtaining the molecular diagram relation structure information of each substance in the chemical reaction formula comprises:
obtaining a connection relation of molecular diagram nodes of substances in a chemical reaction formula;
and determining molecular diagram relation structure information of each substance according to the connection relation.
4. The method according to claim 1, wherein the operation step prediction model includes an encoder and a decoder, the inputting the molecular map feature matrix and the molecular map relational structure information into the operation step prediction model, outputting an initial operation step corresponding to the chemical reaction formula, includes:
inputting the molecular diagram feature matrix and molecular diagram relation structure information of each substance to an encoder, and outputting context vectors of each substance through the encoder;
and inputting the context vector of each substance to the decoder, and outputting the initial operation steps corresponding to the chemical reaction formulas.
5. The method of claim 4, wherein the substance comprises reactants, conditions, and products; inputting the molecular diagram feature matrix and the molecular diagram relation structure information of each substance to an encoder, and outputting the context vector of each substance through the encoder, wherein the method comprises the following steps:
respectively acquiring a context vector of the reactant, a context vector of the condition object and a context vector of the product;
adding the context vector of the reactant and the context vector of the condition object to obtain an intermediate vector;
and combining the intermediate vector with the context vector of the product to obtain the context vector of the chemical reaction formula.
6. The method of claim 4, wherein the decoder comprises a GRU network decoder, and the inputting the context vector to the decoder and outputting the initial operation step corresponding to the chemical reaction formula comprises:
inputting a word vector of a preset start word symbol and the context vector into a first GRU unit of a GRU network decoder, and outputting a word vector of a first predicted word symbol and a first hidden state, wherein the GRU network decoder comprises a plurality of GRU units;
and inputting the word vector of the first predicted word symbol and the first hidden state into a next GRU unit, and repeating until the preset end word symbol is output or the length of the predicted word symbols reaches a preset value, so as to obtain an initial operation step corresponding to the chemical reaction formula.
7. The method according to any one of claims 1 to 6, wherein the training mode of the operation step prediction model comprises:
obtaining a first sample set, wherein the first sample set comprises a first chemical reaction formula sample labeled with an operation step category;
obtaining a molecular diagram feature matrix sample and a molecular diagram relation structure information sample of a substance in the first chemical reaction formula sample;
inputting the molecular diagram feature matrix sample and the molecular diagram relation structure information sample into an initial operation step prediction model to generate a prediction result;
and iteratively adjusting training parameters in the initial operation step prediction model based on the difference between the prediction result and the marked operation step category until the preset requirement is met, so as to obtain the operation step prediction model.
8. The method according to any one of claims 1 to 6, wherein the step of screening the plurality of sets of operation steps using an automated test equipment and a preset optimization algorithm to determine an optimal operation step comprises:
for each group of operation steps in the multiple groups of operation steps, acquiring the corresponding reaction result obtained by the automatic test device performing the chemical reaction operation on the substances in the chemical reaction formula according to the group of operation steps;
and determining the optimal operation steps from the multiple groups of operation steps according to the multiple reaction results and a preset optimization algorithm.
9. The method according to claim 8, wherein, for each group of operation steps in the multiple groups of operation steps, acquiring the corresponding reaction result obtained by the automatic test device performing the chemical reaction operation on the substances in the chemical reaction formula according to the group of operation steps comprises:
and aiming at each group of operation steps in the multiple groups of operation steps, converting each group of operation steps into instruction information identified by an automatic test device, wherein the instruction information is used for instructing the automatic test device to perform chemical reaction operation on the substances in the chemical reaction formula according to the group of operation steps to obtain a corresponding reaction result.
10. The method of claim 8, wherein the predetermined optimization algorithm comprises a bayesian optimization model, the reaction results comprise yields or purities of products, and wherein determining the optimal operation steps from the plurality of operation steps based on the plurality of reaction results and the predetermined optimization algorithm comprises:
Inputting the set of operation steps into an initialized proxy model, and fitting the initialized proxy model to obtain the yield or purity of the product, wherein the proxy model is used for representing prior distribution;
selecting a group of operation steps as the candidate test based on the result of the acquisition function on the prior distribution, wherein the acquisition function is used for deciding, based on the mean and the variance given by the initialized proxy model, whether to explore a new combination of operation steps or to exploit a group of operation steps for which experimental values have already been obtained;
testing according to the group operation steps of the alternative test by using an automatic testing device to obtain the actual yield or purity of the product;
updating the prior distribution of the proxy model by using the yield or purity of the real product to obtain posterior distribution;
and repeating the above process, with the posterior distribution used as the updated prior distribution, until the actual yield or purity of the product meets the preset requirement.
11. An apparatus for generating chemical reaction operation steps based on machine learning and an automated test device, comprising:
the first acquisition module is used for acquiring a molecular diagram feature matrix, molecular diagram relation structure information and molecular fingerprint characteristics of each substance in the chemical reaction formula;
The first prediction module is used for inputting the molecular diagram feature matrix and the molecular diagram relation structure information into an operation step prediction model and outputting an initial operation step corresponding to the chemical reaction formula;
the second prediction module is used for inputting the molecular fingerprint characteristics into a feeding sequence prediction model and outputting the feeding sequence of the substances in the chemical reaction formula;
and the generation module is used for matching each substance into the initial operation steps according to the feeding sequence to obtain the operation steps corresponding to the chemical reaction formulas.
12. The apparatus of claim 11, wherein the acquisition module comprises:
the first acquisition submodule is used for acquiring the atomic characteristics and the chemical bond characteristics of the molecular diagram of the substance in the chemical reaction formula;
and the first determining submodule is used for determining a molecular diagram feature matrix of each substance according to the atomic feature and the chemical bond feature.
13. The apparatus of claim 11, wherein the acquisition module comprises:
the second acquisition submodule is used for acquiring the connection relation of molecular diagram nodes of substances in the chemical reaction formula;
and the second determining submodule is used for determining molecular diagram relation structure information of each substance according to the connection relation.
14. The apparatus of claim 11, wherein the operation step prediction model comprises an encoder and a decoder, and wherein the first prediction module comprises:
the first extraction submodule is used for inputting the molecular diagram feature matrix and the molecular diagram relation structure information of each substance to an encoder, and outputting the context vector of each substance through the encoder;
and the first prediction submodule is used for inputting the context vector of each substance to the decoder and outputting the initial operation step corresponding to the chemical reaction formula.
15. The apparatus of claim 14, wherein the substance comprises reactants, conditions, and products; the first extraction submodule includes:
the acquisition unit is used for inputting the molecular diagram feature matrix and the molecular diagram relation structure information of each substance to the encoder, and outputting the context vector of each substance through the encoder;
the processing unit is used for adding the context vector of the reactant and the context vector of the condition object to obtain an intermediate vector;
and the generating unit is used for combining the intermediate vector and the context vector of the product to obtain the context vector of the chemical reaction formula.
16. The apparatus of claim 14, wherein the decoder comprises a GRU network decoder, the first prediction submodule comprising:
the first prediction unit is used for inputting a word vector of a preset start word symbol and the context vector into a first GRU unit of the GRU network decoder, and outputting the word vector of the first prediction word symbol and a first hidden state, wherein the GRU network decoder comprises a plurality of GRU units;
and the second prediction unit is used for inputting the word vector of the first predicted word symbol and the first hidden state into the next GRU unit, and repeating until the preset end word symbol is output or the length of the predicted word symbols reaches a preset value, so as to obtain an initial operation step corresponding to the chemical reaction formula.
17. The apparatus of any one of claims 11 to 16, further comprising a first training module comprising:
a third acquisition sub-module for acquiring a first sample set, wherein the first sample set includes a first chemical reaction formula sample labeled with an operation step category;
a fourth obtaining submodule, configured to obtain a molecular diagram feature matrix sample and a molecular diagram relationship structure information sample of a substance in the first chemical reaction formula sample;
The second prediction submodule is used for inputting the molecular diagram feature matrix sample and the molecular diagram relation structure information sample into an initial operation step prediction model to generate a prediction result;
the first generation sub-module is used for iteratively adjusting training parameters in the initial operation step prediction model based on the difference between the prediction result and the marked operation step category until the preset requirement is met, so as to obtain the operation step prediction model.
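The training procedure of claim 17 (predict, compare with the labeled category, adjust parameters, stop when a preset requirement is met) can be sketched with a small softmax classifier. The classifier, the cross-entropy loss, the learning rate, the loss threshold, and the toy feature vectors standing in for molecular diagram features are assumptions for illustration; the patent's actual model is the encoder-decoder described in the other claims.

```python
import math
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (feature vector, labeled operation step category)

# Hypothetical stand-ins for flattened molecular diagram features; last entry is a bias.
SAMPLES: List[Sample] = [
    ([1.0, 0.0, 0.0, 1.0], 0),
    ([0.0, 1.0, 0.0, 1.0], 1),
    ([0.0, 0.0, 1.0, 1.0], 2),
]


def softmax(scores: List[float]) -> List[float]:
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scores_for(weights: List[List[float]], features: List[float]) -> List[float]:
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def train(samples: List[Sample], n_classes: int, lr: float = 0.5,
          loss_target: float = 0.05, max_epochs: int = 5000):
    """Adjust weights until the mean cross-entropy drops below loss_target."""
    n_features = len(samples[0][0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    loss = float("inf")
    for _ in range(max_epochs):
        loss = 0.0
        for features, label in samples:
            probs = softmax(scores_for(weights, features))
            loss -= math.log(probs[label])  # difference between prediction and label
            for c in range(n_classes):
                err = probs[c] - (1.0 if c == label else 0.0)
                for j in range(n_features):
                    weights[c][j] -= lr * err * features[j]
        loss /= len(samples)
        if loss < loss_target:  # the "preset requirement" in this sketch
            break
    return weights, loss


def predict(weights: List[List[float]], features: List[float]) -> int:
    scores = scores_for(weights, features)
    return max(range(len(scores)), key=scores.__getitem__)


weights, final_loss = train(SAMPLES, n_classes=3)
print([predict(weights, f) for f, _ in SAMPLES], round(final_loss, 3))
```

A loss threshold is only one possible "preset requirement"; a fixed number of iterations or a validation-accuracy target would fit the claim language equally well.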
18. The apparatus according to any one of claims 11 to 16, further comprising:
the second acquisition module is used for acquiring a plurality of groups of operation steps corresponding to the chemical reaction formula;
the third acquisition module is used for performing, by an automatic test device, the chemical reaction operation on the substances in the chemical reaction formula according to each group of operation steps in the plurality of groups of operation steps, so as to obtain a corresponding reaction result;
and the determining module is used for determining the optimal operation steps from the plurality of groups of operation steps according to a plurality of reaction results and a preset optimization algorithm.
19. The apparatus of claim 18, wherein the third acquisition module comprises:
a fifth acquisition sub-module, configured to convert each group of operation steps in the plurality of groups of operation steps into instruction information recognizable by the automatic test device, where the instruction information is used to instruct the automatic test device to perform the chemical reaction operation on the substances in the chemical reaction formula according to that group of operation steps, so as to obtain a corresponding reaction result.
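The conversion recited in claim 19 can be pictured as a lookup from operation step categories to device commands. The sketch below is hypothetical throughout: the patent does not disclose an instruction format, so the step dictionaries, the command names, the device identifiers, and the JSON serialization are assumptions chosen only to make the idea concrete.

```python
import json
from typing import Any, Dict, List

Step = Dict[str, Any]

# Hypothetical mapping from an operation step category to (device, command).
DEVICE_COMMANDS = {
    "add": ("reactor", "DISPENSE"),
    "stir": ("reactor", "STIR"),
    "heat": ("reactor", "SET_TEMPERATURE"),
    "detect": ("detection_equipment", "MEASURE"),
}


def to_instructions(steps: List[Step]) -> List[Dict[str, Any]]:
    """Convert one group of operation steps into ordered device instructions."""
    instructions = []
    for seq, step in enumerate(steps, start=1):
        action = step["action"]
        if action not in DEVICE_COMMANDS:
            raise ValueError(f"no device command for operation step {action!r}")
        device, command = DEVICE_COMMANDS[action]
        # Everything except the action name is passed through as parameters.
        params = {key: value for key, value in step.items() if key != "action"}
        instructions.append(
            {"seq": seq, "device": device, "command": command, "params": params}
        )
    return instructions


# One hypothetical group of operation steps for a reaction.
group = [
    {"action": "add", "substance": "A", "amount_ml": 5},
    {"action": "heat", "temperature_c": 80},
    {"action": "stir", "minutes": 30},
    {"action": "detect", "quantity": "yield"},
]

print(json.dumps(to_instructions(group), indent=2))
```

Rejecting unknown step categories before anything is sent keeps an unrecognized prediction from reaching the hardware, which seems a sensible default for an automated reactor.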
20. The apparatus of claim 18, wherein the preset optimization algorithm comprises a Bayesian optimization model, the reaction results comprise the yield or purity of a product, and the determining module comprises:
a fitting sub-module, configured to input the groups of operation steps into an initialized proxy model and obtain the yield or purity of the product by fitting the initialized proxy model, where the proxy model is used to represent a prior distribution;
a selection sub-module for selecting a candidate group of operation steps to test based on the result of an acquisition function on the prior distribution, where the acquisition function is used for deciding, based on the mean and variance given by the initialized proxy model, whether to explore a new group of operation steps or to exploit a group of operation steps for which an experimental value has already been obtained;
a test sub-module for performing a test according to the candidate group of operation steps by using the automatic test device, so as to obtain the real yield or purity of the product;
an updating sub-module for updating the prior distribution of the proxy model by using the real yield or purity of the product to obtain a posterior distribution;
and an iteration sub-module for repeatedly updating the prior distribution with the posterior distribution data until the real yield or purity of the product meets a preset requirement.
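The loop of claim 20 can be sketched as a small Bayesian optimization routine. Several simplifications are assumed and none comes from the patent: each group of operation steps is reduced to a single normalized parameter, the proxy model is a Gaussian process with an RBF kernel, the acquisition function is an upper confidence bound (mean plus a multiple of the standard deviation), and `simulated_yield` stands in for a real run on the automatic test device.

```python
import math
from typing import Callable, List, Tuple


def rbf(a: float, b: float, length: float = 0.2) -> float:
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def posterior(xs: List[float], ys: List[float], x_new: float,
              noise: float = 1e-4) -> Tuple[float, float]:
    """Mean and variance of the proxy model at x_new given the tested points."""
    if not xs:
        return 0.0, 1.0  # prior: zero mean, unit variance
    k = [[rbf(p, q) + (noise if i == j else 0.0) for j, q in enumerate(xs)]
         for i, p in enumerate(xs)]
    k_star = [rbf(p, x_new) for p in xs]
    mean = sum(ks * w for ks, w in zip(k_star, solve(k, ys)))
    var = 1.0 - sum(ks * w for ks, w in zip(k_star, solve(k, k_star)))
    return mean, max(var, 0.0)


def optimize(candidates: List[float], run_experiment: Callable[[float], float],
             target: float, max_rounds: int = 10, kappa: float = 2.0):
    xs: List[float] = []
    ys: List[float] = []
    for _ in range(max_rounds):
        untested = [c for c in candidates if c not in xs]
        if not untested:
            break

        def ucb(c: float) -> float:
            mean, var = posterior(xs, ys, c)
            return mean + kappa * math.sqrt(var)  # exploit mean, explore variance

        choice = max(untested, key=ucb)
        xs.append(choice)
        ys.append(run_experiment(choice))  # real result updates the proxy model
        if ys[-1] >= target:  # preset requirement met
            break
    best = max(range(len(ys)), key=ys.__getitem__)
    return xs[best], ys[best], len(xs)


def simulated_yield(x: float) -> float:
    """Hypothetical experiment: yield peaks at x = 0.6."""
    return math.exp(-((x - 0.6) ** 2) / 0.02)


CANDIDATES = [i / 10 for i in range(11)]
print(optimize(CANDIDATES, simulated_yield, target=0.95, max_rounds=len(CANDIDATES)))
```

Each observation is appended to `xs` and `ys`, so the posterior computed in one round serves as the prior for the next, which mirrors the updating and iteration sub-modules.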
21. A system for generating chemical reaction operation steps based on machine learning and an automatic test device, characterized by comprising:
the automatic test device comprises screening equipment, a reactor and detection equipment which are electrically connected;
an electronic device, comprising:
a processor, a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method for generating chemical reaction operation steps of any one of claims 1 to 10.
22. A device for generating chemical reaction operation steps based on machine learning and an automatic test device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method for generating chemical reaction operation steps of any one of claims 1 to 10.
23. A non-transitory computer readable storage medium, characterized in that instructions in the storage medium, when executed by a processor, enable the processor to perform the method for generating chemical reaction operation steps of any one of claims 1 to 10.
CN202210741235.0A 2022-06-28 2022-06-28 Method for generating operation steps based on machine learning and automatic test device Active CN114974450B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210741235.0A CN114974450B (en) 2022-06-28 2022-06-28 Method for generating operation steps based on machine learning and automatic test device

Publications (2)

Publication Number Publication Date
CN114974450A (en) 2022-08-30
CN114974450B (en) 2023-05-30

Family

ID=82964874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210741235.0A Active CN114974450B (en) 2022-06-28 2022-06-28 Method for generating operation steps based on machine learning and automatic test device

Country Status (1)

Country Link
CN (1) CN114974450B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114220496A (en) * 2021-11-30 2022-03-22 华南理工大学 Deep learning-based inverse synthesis prediction method, device, medium and equipment

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7599897B2 (en) * 2006-05-05 2009-10-06 Rockwell Automation Technologies, Inc. Training a support vector machine with process constraints
CN106777922A (en) * 2016-11-30 2017-05-31 华东理工大学 A kind of CTA hydrofinishings production process agent model modeling method
CN111312338A (en) * 2020-02-10 2020-06-19 华东理工大学 Product prediction method and system for aromatic hydrocarbon isomerization production link
US20220036182A1 (en) * 2020-07-29 2022-02-03 Samsung Electronics Co., Ltd. Method and apparatus for synthesizing target products by using neural networks
LU102103B1 (en) * 2020-09-30 2022-03-30 Wurth Paul Sa Computer System and Method Providing Operating Instructions for Thermal Control of a Blast Furnace
CN112037868B (en) * 2020-11-04 2021-02-12 腾讯科技(深圳)有限公司 Training method and device for neural network for determining molecular reverse synthetic route
CN113160902A (en) * 2021-04-09 2021-07-23 大连理工大学 Method for predicting enantioselectivity of chemical reaction product
CN113782109A (en) * 2021-09-13 2021-12-10 烟台国工智能科技有限公司 Reactant derivation method and reverse synthesis derivation method based on Monte Carlo tree
CN114360662A (en) * 2021-12-21 2022-04-15 武汉大学 Single-step inverse synthesis method and system based on two-way multi-branch CNN

Also Published As

Publication number Publication date
CN114974450A (en) 2022-08-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant