WO2023133678A1 - Method for predicting chemical reaction

Info

Publication number: WO2023133678A1
Application number: PCT/CN2022/071283
Authority: WO
Inventors: 陈德铭; 马汝建; 陈志刚; 李革
Original assignee: 上海药明康德新药开发有限公司
Priority date: 2022-01-11
Filing date: 2022-01-11
Publication date: 2023-07-20

Abstract

Disclosed is a method for predicting chemical reaction products, comprising: predicting reaction products of different reactions on the basis of an original training model trained using an original data set D0, calculating reactions of which the confidence is lower than a threshold value, and screening data to form a first data set D1; providing a second data set D2, and screening reactions similar to the chemical reactions in the first data set D1 as a third data set D3; and merging data of D3 into the original data set or independently using D3, and re-performing model training. The method of the present invention can improve the relation between the confidence and the true accuracy of prediction, so that high-confidence prediction has high accuracy, and the accuracy of reaction prediction is finally improved; moreover, the method also has the advantages of being small in data volume and short in time.

Description

A method for predicting chemical reactions

Technical field

The present application relates to the field of computer technology, in particular to a method and device for predicting chemical reaction products.

Background

In medicinal chemistry, the organic synthesis of new molecules requires predicting and evaluating chemical reactions, whether proposed by organic chemists or generated by computer algorithms, in order to avoid the losses and waste caused by failed experiments.

The prediction accuracy of existing reaction prediction models depends heavily on the training data, and model performance may be limited by incomplete reaction data. Simply retraining a model on reaction data supplemented without selection cannot effectively address the key reactions of interest in a specific application domain. For example, for cyclization reactions, which are important in organic synthesis design, more reaction data is not necessarily better: indiscriminate supplementation cannot effectively improve prediction for this category and may even degrade it.

Summary of the invention

Based on this, it is necessary to provide a method and device for predicting chemical reaction products in view of the technical problem that the prediction accuracy of the current reaction prediction model is not high.

In one aspect, the present invention discloses a method for predicting chemical reaction products, the method comprising:

Step 1: Obtain one or more machine models that can generate reaction predictions and output a confidence for each prediction; for a given chemical reaction in the original data set, calculate the confidence of the predicted product under each model and aggregate these values into an overall confidence across all models; select the reaction data whose confidence is below the threshold to obtain the first data set D1, wherein the threshold is any number from 0.3 to 0.9, preferably 0.4 to 0.8, more preferably 0.5 to 0.7, such as about 0.5, 0.6 or 0.7;

Step 2: Provide a second data set D2; calculate the similarity sim(w,v) between the chemical reaction W in D2 and the chemical reaction V in D1; select from D2 the supplementary similar reactions whose sim(w,v) is greater than or equal to the threshold, and collect them to obtain the third data set D3, wherein the threshold is any value from 0.1 to 1, preferably 0.3 to 0.8, more preferably 0.5 to 0.8, such as about 0.6, 0.7 or 0.8;

Step 3: Merge the D3 data into the original dataset or use the D3 data to retrain the model.

In one embodiment, in step 1, each of the K≥1 machine models capable of generating reaction predictions and outputting prediction confidences is characterized by its model parameters θt, t=1, 2, ..., K, where t denotes the t-th model snapshot and K is the number of collected model snapshots.

When the original training data D0 is available, a Transformer machine translation model is selected as the original training model. In other embodiments, the machine model can be replaced by other models based on deep neural networks.

In one embodiment, in step 1, when the product information is known, confidence = p(Y|X,θt); when the reaction product information is unknown, the model takes Y_max = argmax_i p(Y_i|X,θt) to obtain (X, Y_max), with confidence = p(Y_max|X,θt), where Y_i is the i-th output prediction that the model can provide, preferably i≤10. Here t denotes the t-th model snapshot, K is the number of collected model snapshots, X denotes the reactants of the chemical reaction, Y denotes the product of the reaction, p denotes the probability that the model outputs Y given X and θt, Y_max is the product predicted by the model, and argmax selects the Y_i with the highest probability among all candidates.

In one embodiment, in step 1, the overall confidence can be characterized as the mean, mean(confidence(X,θt)), or the maximum, max(confidence(X,θt)), taken over the K model snapshots, or by other statistical operations readily available to those skilled in the art.
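The aggregation over snapshots can be illustrated with a minimal sketch. The function name `overall_confidence` and the example scores are hypothetical and are not part of the application; only the mean and max operations come from the text.

```python
from statistics import mean


def overall_confidence(per_snapshot, how="mean"):
    """Combine the confidences p(Y|X, θt), t = 1..K, into one overall score.

    `per_snapshot` holds one confidence per model snapshot (hypothetical input).
    """
    if not per_snapshot:
        raise ValueError("need at least one model snapshot (K >= 1)")
    if how == "mean":
        return mean(per_snapshot)
    if how == "max":
        return max(per_snapshot)
    raise ValueError(f"unknown aggregation: {how}")


# Illustrative confidences of one reaction under K = 3 snapshots.
scores = [0.42, 0.55, 0.61]
print(overall_confidence(scores, "mean"))
print(overall_confidence(scores, "max"))
```

Taking the mean rewards reactions that every snapshot predicts confidently, whereas the maximum only requires one confident snapshot, so the same threshold selects a larger D1 under the mean.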

In one embodiment, in step 2, for any reaction W∈D1, its similarity sim(w,v) with any reaction V∈D2 is calculated under the model parameters θt, where sim(w,v) = sim(w=encoding(W), v=encoding(V)), and w and v are the encodings produced by the model θt for the input reactions W and V, respectively.

In one embodiment, in step 2, D3 contains less data than the original training data D0, and the number of reactions that D3 supplements for each reaction in D1 can be kept within one hundred; preferably |D3|≤|D0| and |D3|≤50×|D1|.

In one embodiment, in step 3, R times the data volume of D3 is randomly sampled from the original training data D0 and merged with D3 to generate a new data set, on which the re-initialized machine model parameters are retrained; R can be selected from the range [0.5, max(1, |D0|/|D3|)].

In one embodiment, in step 3, D3 is used to generate a new data set, and the re-initialized machine model parameters are retrained.

Description of the drawings

Figure 1 shows a schematic flow diagram of a method for predicting compound reaction products.

Figure 2 shows the number of similar neighbor reactions for the 7 incorrectly predicted reactions.

Figure 3 shows the number of similar neighbor reactions for the 12 correctly predicted reactions.

Detailed description

The present invention will be described in detail below based on the embodiments and in conjunction with the accompanying drawings. The above aspects of the invention and other aspects of the invention will be apparent from the following detailed description. The scope of the present invention is not limited to the following examples.

As shown in Figure 1, the present invention discloses a method for predicting chemical reaction products, said method comprising:

Step 1: Based on the original training model, predict the reaction products of different reactions and identify the "under-learned" reactions whose confidence is below the threshold; screen these data to form the first data set D1.

Step 2: Screen similar reactions to the "under-learned" chemical reactions as the third data set D3.

Step 3: Merge the D3 data into the original data set and retrain the model.

In step 1, one or more machine models that can generate reaction predictions and output a confidence for each prediction are first obtained; then, for a given chemical reaction in the original data set, the confidence of the predicted product under each model is calculated and the overall confidence across all models is computed; finally, the reaction data whose confidence is below a threshold, such as 0.5, are selected to obtain the first data set D1.

In step 1, each of the K≥1 machine models capable of generating reaction predictions and outputting prediction confidences is characterized by its model parameters θt, t=1, 2, ..., K. When the product information is known, confidence = p(Y|X,θt); when the reaction product information is unknown, the model takes Y_max = argmax_i p(Y_i|X,θt) to obtain (X, Y_max), where Y_i is the i-th output prediction that the model can provide, i≤10. The overall confidence can be characterized as the mean, mean(confidence(X,θt)), or the maximum, max(confidence(X,θt)), or by other statistical operations readily available to those skilled in the art.

Confidence can be calculated as follows. The reactant part X of the reaction data is passed through the weights of each layer of the multi-layer neural network of the trained Transformer model. At the output layer, the model produces a raw weight z_i (>0), i=1,2,...,M, for each of the M possible element symbols in the output product. A Softmax normalizes these weights into probabilities, which serve as the confidence of each symbol i, and the element-symbol sequence with the highest probability is output as the prediction Y.

In other embodiments, the machine model can be replaced by other deep neural networks; the output layer is still computed with the same Softmax, and only the form of the output element symbols changes.
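The Softmax formula itself is not reproduced in this text. In its standard form, written with the symbols defined above (raw weights z_i over the M possible element symbols), it reads:

```latex
p_i = \frac{e^{z_i}}{\sum_{j=1}^{M} e^{z_j}}, \qquad i = 1, 2, \ldots, M,
\qquad \text{so that } 0 < p_i < 1 \text{ and } \sum_{i=1}^{M} p_i = 1 .
```

Here p_i is the name used in this note for the confidence of symbol i; the application does not assign it a symbol.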

In step 2, a second data set D2 is first provided; the similarity sim(w,v) between the chemical reactions in D2 and the chemical reactions in D1 is then calculated; finally, the supplementary similar reactions in D2 whose sim(w,v) is greater than or equal to the threshold are collected to obtain the third data set D3, wherein the threshold is any value between 0.1 and 1.

For any reaction W∈D1, its similarity sim(w,v) with any reaction V∈D2 is calculated under the model parameters θt, where sim(w,v) = sim(w=encoding(W), v=encoding(V)), and w and v are the encodings produced by the model θt for the input reactions W and V, respectively. D3 contains less data than the original training data D0, and the number of reactions that D3 supplements for each reaction in D1 can be kept within one hundred; preferably |D3|≤|D0| and |D3|≤50×|D1|.

In a specific embodiment, taking reaction W as an example, w = f(W,θt) = [w1, w2, ..., wn], where f(W,θt) is the vector representation obtained by passing reaction W through the parameters of each layer of the model θt and taking the layer immediately before the output prediction elements, and n is a preset model parameter giving the length of the vector. Similarly, each reaction V of D2 yields v = f(V,θt) = [v1, v2, ..., vn]. n can be selected in the range from 2⁶=64 to 2¹²=4096, preferably n=256.

sim(w,v) can be implemented as the normalized reciprocal of the Euclidean distance between w and v (with 1 added to the distance so that the divisor is never 0), or as another normalized similarity available to those skilled in the art.

In step 3, the D3 data are merged into the original data set and the model is retrained. First, R times the data volume of D3 is randomly sampled from the original training data D0 and merged with D3 to generate a new data set; the re-initialized machine model parameters are then retrained on it. R can be selected from the range [0.5, max(1, |D0|/|D3|)].

Alternatively, D3 is used for fine-tuning, that is, the model θt is trained on the D3 data for F≥1 iterations and its parameters are continuously updated.
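The formula is not reproduced in this text. One reading consistent with the description (reciprocal of the Euclidean distance, with 1 added to the divisor) is the following; it is offered as an assumption, not as the applicant's exact expression:

```latex
\mathrm{sim}(w, v) = \frac{1}{1 + \lVert w - v \rVert_2},
\qquad
\lVert w - v \rVert_2 = \sqrt{\sum_{k=1}^{n} (w_k - v_k)^2} .
```

Under this reading sim(w,v) lies in (0, 1], equals 1 only when the two encodings coincide, and decreases as the encodings move apart, which matches the stated threshold range of 0.1 to 1.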
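The sample-and-merge variant of step 3 can be sketched as follows. The function name `build_retraining_set`, the fixed seed, and the rounding of R×|D3| to an integer are assumptions made for illustration; the range check on R follows the text.

```python
import random


def build_retraining_set(d0, d3, r, seed=0):
    """Sample about r * |D3| reactions from D0 at random and merge them with D3."""
    if not d3:
        raise ValueError("D3 must not be empty")
    upper = max(1.0, len(d0) / len(d3))
    if not 0.5 <= r <= upper:
        raise ValueError(f"R must lie in [0.5, {upper}]")
    # Rounding to a whole number of reactions is an assumption of this sketch.
    n = min(len(d0), round(r * len(d3)))
    rng = random.Random(seed)
    return rng.sample(d0, n) + list(d3)


# Toy data: 100 original reactions, 4 supplementary ones, R = 2.
d0 = [f"d0_reaction_{i}" for i in range(100)]
d3 = ["d3_a", "d3_b", "d3_c", "d3_d"]
merged = build_retraining_set(d0, d3, r=2.0)
print(len(merged))  # 8 sampled from D0 plus the 4 reactions of D3
```

With R at its upper bound |D0|/|D3|, the sample covers all of D0, so the variant reduces to merging the two data sets directly.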

Embodiment 1: Method for predicting chemical reaction products

1. Based on the original training model trained on the original data set D0, predict the reaction products of different reactions, identify the reactions whose confidence is below the threshold, and screen these data to form the first data set D1

In this embodiment, the original training data D0 is available: D0 consists of 400,000 training reactions from the public United States Patent and Trademark Office (USPTO) data set. The Transformer machine translation model is selected as the original training model (Philippe Schwaller et al., Molecular Transformer: A model for uncertainty-calibrated chemical reaction prediction, ACS Central Science, 2019 Sep 25; 5(9):1572-1583; Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin, Attention is all you need, in Advances in Neural Information Processing Systems, pp. 5998–6008, 2017). In other embodiments, the machine model can be replaced by other deep neural networks (Coley, Connor W., et al., A graph-convolutional neural network model for the prediction of chemical reactivity, Chemical Science 10.2 (2019): 370-377; John Bradshaw, Matt J. Kusner, Brooks Paige, Marwin H. S. Segler, José Miguel Hernández-Lobato, A Generative Model For Electron Paths, https://arxiv.org/abs/1805.10970); only the form of the output element symbols changes. K≥1 model snapshots are recorded during the training iterations on D0; each snapshot is described by its model parameters θt, t=1,2,...,K, where t denotes the t-th model snapshot and K is the number of snapshots collected.

The snapshots θt, t=1,2,...,K, are taken at different iterations of model training. One pass in which the model parameters are updated on every sample of the training data is called an epoch; K can be chosen as the total number of epochs, so that θt corresponds to the model after the t-th epoch. When K=1, the model trained to the last epoch is selected. In other embodiments, an epoch may also be defined as a fixed number of iterations, for example every 1000 iterations.

Given the chemical reaction data (X,Y) to be analyzed, X denotes the reactants of the chemical reaction and Y denotes the product of the reaction. The confidence, confidence = p(Y|X,θt), is calculated from the parameters θt of the model snapshot, where p denotes the probability that the model outputs Y given X and θt. If only X is given, the model takes Y_max = argmax_i p(Y_i|X,θt) to obtain (X, Y_max), where Y_max is the product predicted by the model and argmax selects the Y_i with the highest probability among all candidates. Y_i is the output prediction with the i-th highest score obtained by the model through beam search, i≤10.

Confidence can be calculated as follows. The reactant part X of the reaction data is passed through the weights of each layer of the multi-layer neural network of the trained Transformer model. At the output layer, the model produces a raw weight z_i (>0), i=1,2,...,M, for each of the M possible element symbols in the output product. A Softmax normalizes these weights into probabilities, which serve as the confidence of each symbol i, and the element-symbol sequence with the highest probability is output as the prediction Y.

From the chemical reaction data set to be analyzed, the reactions with confidence<th are selected as the "under-learned" reaction data set D1, where th is the confidence threshold; in this embodiment th=0.5.

In other embodiments, the confidence threshold may range from 0.3 to 0.9, preferably 0.4 to 0.8, more preferably 0.5 to 0.7, such as 0.5, 0.6 or 0.7.
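Choosing Y_max among beam-search candidates can be sketched as below. The function name `best_prediction`, the SMILES strings, and the probabilities are hypothetical; the sketch only assumes that beam search returns each candidate product together with its probability.

```python
def best_prediction(candidates, max_candidates=10):
    """Return (Y_max, confidence) among at most `max_candidates` beam outputs.

    `candidates` is a list of (product, probability) pairs, one per beam
    hypothesis Y_i (hypothetical input format).
    """
    considered = candidates[:max_candidates]
    if not considered:
        raise ValueError("the model returned no candidate products")
    # argmax over i of p(Y_i | X, θt)
    return max(considered, key=lambda pair: pair[1])


# Illustrative candidates for one set of reactants X.
beam = [("CCO", 0.20), ("CC=O", 0.70), ("CC", 0.10)]
product, confidence = best_prediction(beam)
print(product, confidence)
```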
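A minimal sketch of this calculation follows. The Softmax is standard; scoring a whole sequence as the product of the probabilities of its symbols is an assumption of this sketch, since the text does not state how per-symbol confidences are combined. The names `softmax` and `sequence_confidence` and the toy weights are hypothetical.

```python
import math


def softmax(z):
    """Normalize raw output weights z_1..z_M into probabilities."""
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def sequence_confidence(weights_per_position, symbol_ids):
    """Probability of a symbol sequence, assumed here to be the product of
    the Softmax probabilities of the symbol chosen at each position."""
    log_p = 0.0
    for z, symbol in zip(weights_per_position, symbol_ids):
        log_p += math.log(softmax(z)[symbol])
    return math.exp(log_p)


# Toy example: M = 3 possible symbols, output sequence of length 2.
weights = [[2.0, 0.5, 0.1], [0.3, 3.0, 0.2]]
print(sequence_confidence(weights, [0, 1]))
```

Summing logarithms instead of multiplying probabilities avoids underflow for long product strings.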
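The screening of D1 amounts to a strict threshold filter, sketched below. The function name `screen_under_learned` and the placeholder reaction labels are hypothetical; the default th=0.5 is the value used in this embodiment.

```python
def screen_under_learned(reactions, confidences, th=0.5):
    """Return the reactions whose confidence is strictly below th (data set D1)."""
    if len(reactions) != len(confidences):
        raise ValueError("one confidence is required per reaction")
    return [r for r, c in zip(reactions, confidences) if c < th]


# Placeholder reaction labels with illustrative confidences.
reactions = ["rxn_1", "rxn_2", "rxn_3", "rxn_4"]
confidences = [0.21, 0.50, 0.93, 0.47]
print(screen_under_learned(reactions, confidences))
```

A reaction whose confidence equals the threshold is kept out of D1, because the criterion is confidence<th rather than confidence≤th.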

2. Provide the second data set D2, and screen for similar reactions to the chemical reactions in the first data set D1 as the third data set D3

A candidate supplementary reaction data set D2={(X’,Y’)} is provided for screening. D2 comes from a supplementary USPTO database; USPTO Stereo contains about 1 million reactions. For any reaction W∈D1, the present invention calculates its similarity with any reaction V∈D2 under the model parameters θ: sim(W,V) = sim(w=encoding(W), v=encoding(V)), where w and v are the encodings produced by the model θ for the input reactions W and V, respectively.

The similarity sim(w,v) is calculated as follows. Taking reaction W as an example, its encoding vector is w = f(W,θ) = [w₁, w₂, ..., wₙ], where f(W,θ) is the vector representation obtained by passing reaction W through the parameters of each layer of the model θ and taking the layer immediately before the output prediction elements, and n is a preset model parameter giving the length of the vector. Similarly, each reaction V of D2 yields an encoding vector v = f(V,θ) = [v₁, v₂, ..., vₙ]. n can be selected in the range from 2⁶=64 to 2¹²=4096; this embodiment uses n=256.

sim(w,v) can be implemented as the normalized reciprocal of the Euclidean distance between w and v (with 1 added to the distance so that the divisor is never 0), or as another normalized similarity available to those skilled in the art.

A similarity threshold th2 is set, chosen from the range [0.1,1].

The supplementary similar reactions that satisfy sim(encoding(W), encoding(V))≥th2 are selected and collected to obtain the set D3 of supplementary similar reaction data. In the examples, th2=0.6 is used, and supplementary result samples for 0.7 and 0.8 are also analyzed.
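The similarity can be sketched in a few lines, assuming that "normalized reciprocal of the Euclidean distance" means 1/(1 + distance); the function name `sim` matches the text, while the toy vectors are hypothetical and far shorter than the n=256 of this embodiment.

```python
import math


def sim(w, v):
    """Similarity of two encoding vectors, assumed to be 1 / (1 + Euclidean distance)."""
    if len(w) != len(v):
        raise ValueError("encoding vectors must have the same length n")
    return 1.0 / (1.0 + math.dist(w, v))


# Toy encodings (n = 2 for readability).
print(sim([1.0, 2.0], [1.0, 2.0]))  # identical encodings
print(sim([0.0, 0.0], [3.0, 4.0]))  # distance 5
```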
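The selection of D3 can be sketched as follows. The sketch makes three assumptions: the similarity is 1/(1 + Euclidean distance), the per-reaction limit of 100 neighbors reflects the "within one hundred" remark of the description, and the selected reactions are combined as a union so that a reaction of D2 close to several reactions of D1 is kept once. The name `select_supplement` and the toy encodings are hypothetical.

```python
import math


def sim(w, v):
    """Assumed similarity: 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.dist(w, v))


def select_supplement(d1_encodings, d2_encodings, th2=0.6, per_reaction_cap=100):
    """Return the sorted indices of the D2 reactions that form D3."""
    selected = set()
    for w in d1_encodings:
        scored = ((sim(w, v), j) for j, v in enumerate(d2_encodings))
        # Keep neighbors at or above the threshold, most similar first.
        hits = sorted((pair for pair in scored if pair[0] >= th2), reverse=True)
        selected.update(j for _, j in hits[:per_reaction_cap])
    return sorted(selected)


# Toy encodings: one under-learned reaction, three candidates.
d1 = [[0.0, 0.0]]
d2 = [[0.0, 0.1], [5.0, 5.0], [0.3, 0.4]]
print(select_supplement(d1, d2, th2=0.6))
print(select_supplement(d1, d2, th2=0.7))
```

Comparing every pair costs |D1|×|D2| similarity evaluations; for data sets of the sizes reported in Embodiment 2, an approximate nearest-neighbor index may be preferable, although the application does not discuss this.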

3. Merge D3 data into the original data set or use D3 alone to retrain the model

In one experiment of this embodiment, the model θ is fine-tuned on D3, that is, the model θ is trained on the D3 data for F≥1 iterations and its parameters are continuously updated. This scheme is used in Embodiment 2.

In another variant of this embodiment, when both the model θ and its original training data (denoted D0) are available, reaction data are drawn from the two sets D0 and D3 in the ratio |D0|:|D3|, that is, D0 and D3 are merged directly into a new data set (option 1), and the re-initialized machine model is retrained on it.

Optionally, a variant of this embodiment treats the fine-tuned model and the retrained model as N=2 candidate models. On the test reaction data set D1 (or another test reaction data set provided additionally), the final model is the candidate whose accuracy improves the most and whose number of predicted or verified reactions in the Confidence>0.9 category, or the stricter Confidence>0.99 category, increases the most.
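One possible reading of this selection rule is sketched below: candidates are ranked by accuracy on the test set, and ties are broken by the number of correct predictions above the confidence cut. This ordering of the two criteria is an assumption, as the text does not say how they are combined; the name `pick_final_model` and the toy results are hypothetical.

```python
def pick_final_model(results, conf_cut=0.9):
    """Pick the candidate with the best accuracy, then the most confident hits.

    `results` maps a model name to a list of (is_correct, confidence) pairs
    measured on the same test reactions (hypothetical input format).
    """
    def score(name):
        rows = results[name]
        accuracy = sum(1 for ok, _ in rows if ok) / len(rows)
        confident_hits = sum(1 for ok, c in rows if ok and c > conf_cut)
        return (accuracy, confident_hits)

    return max(results, key=score)


# Toy evaluation of the N = 2 candidate models on three test reactions.
results = {
    "fine_tuned": [(True, 0.95), (True, 0.60), (False, 0.40)],
    "retrained": [(True, 0.92), (False, 0.70), (False, 0.30)],
}
print(pick_final_model(results))
```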

Embodiment 2: Accuracy evaluation

The experimental results are shown below. The steps of Embodiment 1 were followed. The machine learning model θ is a Transformer with encoding vector dimension n=256, trained for 500,000 iterations, each iteration processing a mini-batch of 4096 tokens. The training reaction data are the 400,000 training reactions from the public United States Patent and Trademark Office (USPTO) data set, denoted D0, and θ is the model output by the last iteration.

The reaction data set D1 used for the under-learning analysis consists of about 1381 reactions extracted from basic organic chemistry textbooks by in-house chemists. The candidate supplementary reaction data set D2 comes from USPTO data that do not overlap with D0, 400,000 reactions in total. It is worth noting that commercial reaction data services hold a large number of reactions in their back ends, but such services only support small queries, and data cannot be retrieved in bulk. For example, Reaxys (www.reaxys.com) contains more than 55 million reactions, but one page of query results shows only about ten reactions.

In the verification experiment, the baseline model, trained only on D0, was compared with the model obtained by the method of the invention: for each reaction in D1, reactions with similarity ≥0.6 were selected from the D2 data set of 400,000 candidate supplementary reactions, giving a D3 data set of about 14,000 reactions, on which the baseline model was fine-tuned as described in Embodiment 1 to obtain the confidence-improved model.

Top-k accuracy is the proportion of all reactions for which one of the k highest-confidence products predicted by the model is identical to the true product. Top-1 accuracy is the proportion of all reactions for which the most likely product predicted by the model is identical to the true product.

As shown in Table 1, the experiment shows that, with the screening method of the present invention, supplementing only about ten similar reactions for each reaction in D1 significantly improves on the original model, both in overall Top-1 accuracy and in the coverage of correct high-confidence predictions: Top-1 accuracy increased by 22.6%, the coverage of Confidence>0.9 increased by 20.86%, and predictions in this confidence interval reached a Top-1 accuracy of 93.9%.
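The Top-k metric can be computed as in the following sketch. The function name `top_k_accuracy` and the placeholder product labels are hypothetical; exact string equality stands in for "identical to the true product", which in practice would require a canonical representation of each product.

```python
def top_k_accuracy(ranked_predictions, true_products, k=1):
    """Fraction of reactions whose true product is among the top k predictions.

    `ranked_predictions[i]` lists the predicted products of reaction i,
    ordered from highest to lowest confidence.
    """
    if len(ranked_predictions) != len(true_products) or not true_products:
        raise ValueError("need one non-empty list of predictions per true product")
    hits = sum(
        1 for preds, truth in zip(ranked_predictions, true_products)
        if truth in preds[:k]
    )
    return hits / len(true_products)


# Placeholder product labels for three test reactions.
predictions = [["A", "B"], ["C", "D"], ["E", "F"]]
truths = ["A", "D", "X"]
print(top_k_accuracy(predictions, truths, k=1))
print(top_k_accuracy(predictions, truths, k=2))
```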

Table 1

In a further experiment, 200 test reactions with low confidence (Confidence<0.5) were screened. Before screening and improvement with the present invention, the prediction accuracy on this set was only 8.5%, confirming that these are "under-learned" reactions. After the baseline model was fine-tuned on data screened by the present invention, the average test Confidence rose from 0.378 to 0.796 and the Top-1 accuracy rose to 60.5%, verifying that the method improves both the accuracy and the confidence of reaction prediction.

Table 2

200 test reactions screened with Confidence<0.5	Baseline model	Confidence-improved model
Average Confidence	0.378	0.796
Top-1 accuracy	8.5%	60.5%

On the other hand, another 100 incorrectly predicted test reactions were randomly selected; their original accuracy is 0, and 33 of them have Confidence>0.9 and 16 have Confidence>0.8. Data were supplemented indiscriminately, that is, similar reactions were not selected according to a prediction Confidence threshold. After supplementation, the Top-1 accuracy on these test reactions is 14%; in other words, when supplementation is not guided by the Confidence threshold and similarity, the accuracy improvement is limited.

Table 3

For the 7 reactions that had Confidence>0.9 (high conf in Figure 2) and were still predicted incorrectly after indiscriminate supplementation, analysis shows that the cause is a lack of similar training or supplementary reaction data: the number of similar neighbor reactions at sim_threshold≥0.6, 0.7 or 0.8 is very small (Figure 2). In contrast, the 12 reactions with Confidence>0.9 and correct predictions have significantly more similar neighbor reactions (Figure 3). These results further illustrate the need for the invention to combine confidence with the screening of similar reactions in order to improve reaction prediction.

Those skilled in the art will appreciate that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications can be made thereto without departing from the spirit and scope of the invention. Therefore, the detailed description and examples of the present invention should not be considered as limiting the scope of the present invention. The invention is limited only by the appended claims. All documents cited herein are hereby incorporated by reference in their entirety.

Claims

A method for predicting chemical reaction products, comprising the following steps:

Step 1: Based on the original training model trained with the original data set D0, predict the reaction products of different reactions and identify the reactions whose confidence is lower than the threshold; screen these data to form the first data set D1.

Step 2: Provide a second data set D2, and screen reactions similar to the chemical reactions in the first data set D1 as a third data set D3.

Step 3: Merge the D3 data into the original dataset or use D3 alone to retrain the model.
The method of claim 1, wherein said step 1 comprises:

obtaining one or more machine models that generate reaction predictions and output a confidence for each prediction;

calculating, for a given chemical reaction in the original data set, the confidence of the predicted product under each model, and computing the overall confidence across all models;

selecting the reaction data whose confidence is less than the threshold to obtain the first data set D1;

wherein the threshold is any number from 0.3 to 0.9, preferably 0.4 to 0.8, more preferably 0.5 to 0.7, such as about 0.5.
The method according to claim 2, wherein, when the product information is known, confidence = p(Y|X,θt); when the reaction product information is unknown, Y_max = argmax_i p(Y_i|X,θt) is taken to obtain (X, Y_max), where Y_i is the i-th output prediction that the model can provide, t denotes the t-th model snapshot, t=1,2,...,K, K is the number of collected model snapshots, X denotes the reactants of the chemical reaction, Y denotes the product of the reaction, p denotes the probability that the model outputs Y given X and θt, Y_max is the product predicted by the model, and argmax selects the Y_i with the highest probability.
The method according to claim 1, wherein said step 2 comprises:

providing a second data set D2;

for the chemical reaction W in D2, calculating its similarity sim(w,v) with the chemical reaction V in D1;

selecting from D2 the supplementary similar reactions whose sim(w,v) is greater than or equal to the threshold, and collecting them to obtain the third data set D3;

wherein the threshold is any number from 0.1 to 1, preferably 0.3 to 0.8, more preferably 0.5 to 0.8, such as about 0.6, 0.7 or 0.8.
The method according to claim 4, wherein sim(w,v) = sim(w=encoding(W), v=encoding(V)), where w and v are the encodings produced by the model θt for the input reactions W and V, respectively.
The method of claim 5, wherein

w = f(W,θt) = [w1, w2, ..., wn], where f(W,θt) is the vector representation obtained by passing reaction W through the parameters of each layer of the model θt and taking the layer before the output prediction elements, and n is a preset model parameter giving the length of the vector; reaction V likewise yields v = f(V,θt) = [v1, v2, ..., vn]; preferably, n is any number in the range from 2⁶=64 to 2¹²=4096.
The method according to claim 1, wherein said step 3 comprises:

randomly sampling, from the original training data D0, R times the data volume of D3, merging the sample with D3 to generate a new data set, and retraining the re-initialized machine model parameters on it, wherein preferably R is selected from 0.5 to max(1, |D0|/|D3|); or

using D3 for fine-tuning, that is, training the model θt on the D3 data for F≥1 iterations with its parameters continuously updated.
A device for predicting chemical reaction products, the device comprising:

The first prediction module is used to predict the reaction products of different reactions based on the original training model and calculate the reactions whose reliability is lower than the threshold, screen these data and form the first data set D1;

The second prediction module is used to provide the second data set D2, and screen similar reactions to the chemical reactions in the first data set D1 as the third data set D3;

The third prediction module is used to merge D3 data into the original data set or use D3 alone to retrain the model.
The device according to claim 8, wherein the first prediction module is specifically used for:

obtaining one or more machine models that generate reaction predictions and output a confidence for each prediction;

calculating, for a given chemical reaction in the original data set, the confidence of the predicted product under each model, and computing the overall confidence across all models;

selecting the reaction data whose confidence is less than the threshold to obtain the first data set D1;

wherein the threshold is any number from 0.3 to 0.9, preferably 0.4 to 0.8, more preferably 0.5 to 0.7, such as about 0.5.
The device according to claim 9, wherein, when the product information is known, confidence = p(Y|X,θt); when the reaction product information is unknown, Y_max = argmax_i p(Y_i|X,θt) is taken to obtain (X, Y_max), where Y_i is the i-th output prediction that the model can provide, t denotes the t-th model snapshot, t=1,2,...,K, K is the number of collected model snapshots, X denotes the reactants of the chemical reaction, Y denotes the product of the reaction, p denotes the probability that the model outputs Y given X and θt, Y_max is the product predicted by the model, and argmax selects the Y_i with the highest probability.
The device according to claim 8, wherein the second prediction module is specifically used for:

providing a second data set D2;

for the chemical reaction W in D1, calculating the similarity sim(w,v) between it and the chemical reaction V in D2;

selecting from D2 the supplementary similar reactions whose sim(w,v) is greater than or equal to the threshold, and collecting them to obtain the third data set D3;

wherein the threshold is any number from 0.1 to 1, preferably 0.3 to 0.8, more preferably 0.5 to 0.8, such as about 0.6, 0.7 or 0.8.
The device according to claim 11, wherein sim(w,v) = sim(w=encoding(W), v=encoding(V)), where w and v are the encodings produced by the model θt for the input reactions W and V, respectively; preferably,

w = f(W,θt) = [w1, w2, ..., wn], where f(W,θt) is the vector representation obtained by passing reaction W through the parameters of each layer of the model θt and taking the layer before the output prediction elements, and n is a preset model parameter giving the length of the vector; reaction V likewise yields v = f(V,θt) = [v1, v2, ..., vn]; preferably, n is any number in the range from 2⁶=64 to 2¹²=4096.
The device according to claim 8, wherein the third prediction module is specifically used for:

randomly sampling, from the original training data D0, R times the data volume of D3, merging the sample with D3 to generate a new data set, and retraining the re-initialized machine model parameters on it, wherein preferably R is selected from 0.5 to max(1, |D0|/|D3|); or

using D3 for fine-tuning, that is, training the model θt on the D3 data for F≥1 iterations with its parameters continuously updated.
A device comprising a processor and a memory, wherein the memory is used for storing a computer program, and the processor is used for executing, according to the computer program, the method for predicting compound reaction products according to any one of claims 1-7.
A computer-readable storage medium, the computer-readable storage medium is used to store a computer program, and the computer program is used to execute the compound reaction product prediction method according to any one of claims 1-7.