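WO2022113945A1 - Information processing system, information processing method, and information processing program - Google Patents

Information processing system, information processing method, and information processing program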

Info

Publication number
WO2022113945A1
Authority
WO
WIPO (PCT)
Prior art keywords
information processing
regression
machine learning
component objects
processing system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2021/042833
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
恭平 花岡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Resonac Corp
Original Assignee
Showa Denko Materials Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Showa Denko Materials Co Ltd
Priority to EP21897918.5A (EP4243026A4)
Priority to KR1020237021006A (KR20230110584A)
Priority to CN202180089147.0A (CN116745850A)
Priority to JP2022565331A (JP7803284B2)
Priority to US18/254,384 (US20240047018A1)
Publication of WO2022113945A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C60/00 Computational materials science, i.e. ICT specially adapted for investigating the physical or chemical properties of materials or phenomena associated with their design, synthesis, processing, characterisation or utilisation

Definitions

  • One aspect of this disclosure relates to information processing systems, information processing methods, and information processing programs.
  • Patent Document 1 describes a method for predicting the binding property between the three-dimensional structure of a biopolymer and the three-dimensional structure of a compound.
  • The method includes: a step of generating a predicted three-dimensional structure of a complex of the biopolymer and the compound based on the three-dimensional structure of the biopolymer and the three-dimensional structure of the compound; a step of converting the predicted three-dimensional structure into a predicted three-dimensional structure vector representing the result of collating it with an interaction pattern; and a step of predicting the binding property between the three-dimensional structure of the biopolymer and the three-dimensional structure of the compound by discriminating the predicted three-dimensional structure vector using a machine learning algorithm.
  • When the component objects are diverse or numerous, it may not be possible to prepare a sufficient amount of data for them, and as a result the accuracy of analysis of the composite object may not reach the expected level. Therefore, a mechanism for improving the accuracy of analysis of a composite object is desired even when a sufficient amount of data cannot be prepared for the component objects.
  • The information processing system includes at least one processor. The at least one processor acquires a numerical representation and a composite ratio for each of a plurality of component objects, executes machine learning based on the plurality of numerical representations to calculate a plurality of regression parameters corresponding to the plurality of component objects, and applies the plurality of composite ratios to a regression model defined by the plurality of regression parameters to calculate a predicted value indicating the characteristics of the composite object obtained by combining the plurality of component objects.
  • the information processing method is executed by an information processing system including at least one processor.
  • This information processing method includes: a step of acquiring a numerical representation and a composite ratio for each of a plurality of component objects; a step of executing machine learning based on the plurality of numerical representations to calculate a plurality of regression parameters corresponding to the plurality of component objects; and a step of applying the plurality of composite ratios to a regression model defined by the plurality of regression parameters to calculate a predicted value indicating the characteristics of the composite object obtained by combining the plurality of component objects.
  • The information processing program causes a computer to execute: a step of acquiring a numerical representation and a composite ratio for each of a plurality of component objects; a step of executing machine learning based on the plurality of numerical representations to calculate a plurality of regression parameters corresponding to the plurality of component objects; and a step of applying the plurality of composite ratios to a regression model defined by the plurality of regression parameters to calculate a predicted value indicating the characteristics of the composite object obtained by combining the plurality of component objects.
  • machine learning is executed based on the data of each component object, and a plurality of regression parameters corresponding to a plurality of component objects are calculated. Then, the composite ratio is applied to the regression model defined by the regression parameter, and the characteristics of the composite object are predicted.
  • the accuracy of analysis of compound objects can be improved even when a sufficient amount of data cannot be prepared for component objects.
  • the information processing system 10 is a computer system that executes analysis on a composite object obtained by combining a plurality of component objects at a given composite ratio.
  • a component object is a tangible or intangible object used to create a composite object.
  • Composite objects can be tangible or intangible.
  • An example of a tangible object is any substance or object.
  • Data and information are examples of intangibles.
  • "Combining a plurality of component objects" means a process of converting the plurality of component objects into one object, that is, a composite object.
  • The method of combining is not limited and may be, for example, compounding, synthesis, binding, mixing, merging, combination, or coalescence, or another method.
  • the analysis of a compound object is a process for obtaining data showing some characteristics of the compound object.
  • Multiple component objects may be any multiple types of materials, in which case the composite object is a multi-component substance produced by those materials.
  • a material is any component used to produce a multi-component substance.
  • the plurality of materials may be any plurality of types of molecules or atoms, and in this case, the composite object is a multi-component substance obtained by combining those molecules or atoms by any method.
  • the material may be a polymer or a monomer, correspondingly the multi-component material may be a polymer alloy.
  • the material may be a monomer, and correspondingly, the multi-component substance may be a polymer.
  • the material may be a drug, i.e. a chemical substance having a pharmacological action, and correspondingly, the multi-component substance may be a drug.
  • The information processing system 10 executes machine learning for the analysis of composite objects.
  • Machine learning is a method of learning based on given information and autonomously finding a law or rule.
  • the specific method of machine learning is not limited.
  • the information processing system 10 may execute machine learning using a machine learning model which is a calculation model including a neural network.
  • A neural network is an information processing model that imitates the mechanism of the human cranial nervous system.
  • The information processing system 10 may execute machine learning using at least one of a graph neural network (GNN), a convolutional neural network (CNN), a recurrent neural network (RNN), an attention RNN (Attention RNN), and multi-head attention (Multi-Head Attention).
  • the information processing system 10 is composed of one or more computers. When a plurality of computers are used, one information processing system 10 is logically constructed by connecting these computers via a communication network such as the Internet or an intranet.
  • FIG. 1 is a diagram showing an example of a general hardware configuration of a computer 100 constituting an information processing system 10.
  • The computer 100 includes a processor 101, such as a CPU, that executes an operating system, application programs, and the like; a main storage unit 102 composed of a ROM and a RAM; an auxiliary storage unit 103 composed of a hard disk, a flash memory, or the like; a communication control unit 104 composed of a network card or a wireless communication module; an input device 105 such as a keyboard and a mouse; and an output device 106 such as a monitor.
  • Each functional element of the information processing system 10 is realized by loading a predetermined program onto the processor 101 or the main storage unit 102 and causing the processor 101 to execute the program.
  • the processor 101 operates the communication control unit 104, the input device 105, or the output device 106 according to the program, and reads and writes data in the main storage unit 102 or the auxiliary storage unit 103.
  • the data or database required for processing is stored in the main storage unit 102 or the auxiliary storage unit 103.
  • FIG. 2 is a diagram showing an example of the functional configuration of the information processing system 10.
  • the information processing system 10 includes an acquisition unit 11, a calculation unit 12, and a prediction unit 13 as functional elements.
  • The acquisition unit 11 is a functional element that acquires data related to a plurality of component objects. Specifically, the acquisition unit 11 acquires a numerical representation and a composite ratio for each of the plurality of component objects.
  • the numerical representation of a component object is data that expresses an arbitrary attribute of a component object using a plurality of numerical values.
  • The attributes of a component object are the properties or characteristics of the component object. A numerical representation may be visualized by various methods, for example, as numbers, letters, text, a molecular graph, a vector, an image, or time-series data, or by a combination of any two or more of these methods.
  • the individual numerical values constituting the numerical representation may be expressed in decimal notation, or may be expressed in other notations such as binary notation and hexadecimal notation.
  • the compound ratio of component objects is the ratio between multiple component objects.
  • the specific type, unit, and expression method of the compound ratio are not limited, and may be arbitrarily determined depending on the component object or the compound object.
  • the compound ratio may be represented by a ratio such as a percentage, a histogram, or an absolute quantity of individual component objects.
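The relationship between absolute quantities and a ratio can be sketched as follows. This is a minimal illustration, assuming all quantities are given in the same unit; the function name `to_ratios` and the example amounts are hypothetical and do not appear in the source.

```python
def to_ratios(quantities):
    """Convert absolute quantities of component objects into ratios that sum to 1."""
    total = sum(quantities)
    if total <= 0:
        raise ValueError("total quantity must be positive")
    return [q / total for q in quantities]


# Hypothetical amounts: 70 units of one component and 30 units of another.
ratios = to_ratios([70.0, 30.0])
print(ratios)
```

Normalizing in this way lets quantities recorded in different forms be fed to the same regression model.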
  • the calculation unit 12 is a functional element for calculating the regression parameters of the regression model for predicting the characteristics of the composite object. Specifically, the calculation unit 12 executes machine learning based on a plurality of numerical representations corresponding to a plurality of component objects to calculate regression parameters.
  • the regression model is an expression for obtaining the value of one or more objective variables y when the value of one or more explanatory variables x is given.
  • the regression model may be a linear regression model or a non-linear regression model.
  • An example of a regression model is Scheffe polynomial. However, the regression model may be another parametric model. Regression parameters are numerical values included in the regression model.
  • the prediction unit 13 is a functional element that predicts the characteristics of the composite object and outputs the predicted value.
  • the characteristics of a composite object are the peculiar properties of a composite object.
  • the prediction unit 13 applies a composite ratio to the regression model defined by the calculated regression parameters to calculate the predicted value.
  • the prediction unit 13 substitutes a plurality of compound ratios into the regression model to calculate the prediction value.
  • The combination of the calculation unit 12 and the prediction unit 13 may be realized by one machine learning model.
  • Alternatively, the calculation unit 12 may be realized by a machine learning model while the prediction unit 13 is realized by an algorithm that does not use a machine learning model.
  • each of at least one machine learning model used in this embodiment is a trained model expected to have the highest estimation accuracy, and therefore can be called the "best machine learning model".
  • The trained model is generated by having a given computer process training data containing many combinations of input vectors and labels.
  • The given computer inputs an input vector into a machine learning model, calculates an output value, and finds the error between the output value and the label given in the training data.
  • the output value is, for example, a predicted value. It can be said that the error between the output value and the label is the difference between the estimation result and the correct answer.
  • the computer updates a given parameter in the machine learning model based on that error.
  • the computer generates a trained model by repeating such learning.
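The learning loop described above can be sketched in a few lines. This is a deliberately simplified illustration, not the model of the source: it assumes a single scalar input, a single parameter `w`, and a squared-error gradient step, and the function name `train` and the data are hypothetical.

```python
def train(samples, learning_rate=0.1, epochs=200):
    """Fit y = w * x by repeatedly reducing the error against the labels."""
    w = 0.0  # the single model parameter, initialised arbitrarily
    for _ in range(epochs):
        for x, label in samples:
            output = w * x                   # output value for this input
            error = output - label           # difference from the correct answer
            w -= learning_rate * error * x   # gradient step on 0.5 * error**2
    return w


# Hypothetical training data generated from y = 2x.
training_data = [(0.5, 1.0), (1.0, 2.0), (1.5, 3.0)]
w_trained = train(training_data)
print(w_trained)
```

Running `train` corresponds to the learning phase; evaluating `w_trained * x` on new inputs corresponds to the operation phase.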
  • the computer that generates the trained model is not limited, and may be, for example, the information processing system 10 or another computer system.
  • the process of generating a trained model can be called the learning phase, and the process of using the trained model can be called the operation phase.
  • the entire machine learning model used in this embodiment may be described by a function that does not depend on the input order. With this mechanism, it is possible to eliminate the influence of the order of multiple vectors in machine learning.
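One simple function that does not depend on input order is element-wise summation over the vectors. The sketch below only illustrates that property; the source does not state which order-independent function is used, and `sum_pool` and the example vectors are hypothetical.

```python
from itertools import permutations


def sum_pool(vectors):
    """Aggregate equal-length vectors by element-wise summation."""
    return [sum(column) for column in zip(*vectors)]


vectors = [[1, 2], [3, 4], [5, 6]]  # hypothetical feature vectors

# Collect the result for every possible ordering of the inputs.
results = {tuple(sum_pool(list(p))) for p in permutations(vectors)}
print(results)
```

Because every ordering yields the same aggregate, `results` contains a single element.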
  • FIG. 3 is a flowchart showing an example of the operation of the information processing system 10 as a processing flow S1.
  • the processing flow S1 corresponds to the operation phase.
  • In step S11, the acquisition unit 11 acquires the numerical representation and the composite ratio for each of the plurality of component objects.
  • For example, the acquisition unit 11 may acquire the numerical representation {1,1,2,3,4,3,3,5,6,7,5,4} of the component object Ea, the numerical representation {1,1,5,6,3,3,5,1,7,0,0} of the component object Eb, and the composite ratio {0.7, 0.3} of the component objects Ea and Eb.
  • In this example, each numerical representation is shown as a vector.
  • The composite ratio {0.7, 0.3} means that the component objects Ea and Eb are used in a ratio of 7:3 to obtain a composite object.
  • the acquisition unit 11 may acquire the data of each of the plurality of component objects by any method.
  • The acquisition unit 11 may access a given database to read data, may receive data from another computer or computer system, or may accept data input by a user of the information processing system 10.
  • the acquisition unit 11 may acquire data by any two or more of these methods.
  • In step S12, the calculation unit 12 calculates a feature vector for each of the plurality of component objects based on the numerical representation.
  • the feature vector is a vector showing the features of the component object.
  • the characteristics of a component object are any elements that make the component object different from other objects.
  • a vector is an n-dimensional quantity having n numerical values, and can be expressed as a one-dimensional array.
  • In step S13, the calculation unit 12 calculates a plurality of regression parameters corresponding to the plurality of component objects based on the calculated plurality of feature vectors.
  • In step S14, the prediction unit 13 calculates a predicted value indicating the characteristics of the composite object by using a regression model defined by the plurality of calculated regression parameters.
  • The regression model defined by the regression parameters is, in short, a regression model in which specific numerical values are set as the regression parameters.
  • The prediction unit 13 applies the plurality of composite ratios to the regression model to calculate the predicted value.
  • In step S15, the prediction unit 13 outputs the predicted value.
  • the method of outputting the predicted value is not limited.
  • the prediction unit 13 may store the predicted value in a given database, send it to another computer or computer system, or display it on a display device.
  • the prediction unit 13 may output the predicted value to another functional element for subsequent processing in the information processing system 10.
  • FIGS. 4 and 5 are both diagrams showing an example of a procedure for calculating regression parameters.
  • In these examples, the component objects are three materials (polymers): polystyrene, polyacrylic acid, and polybutyl methacrylate. Any form of numerical representation may be provided for each of these materials.
  • In step S121, which is part of step S12, the calculation unit 12 calculates the feature vector Z from the numerical representation for each of the plurality of component objects, using a machine learning model for an embedding function that calculates the features of a vector.
  • This machine learning model is a trained model.
  • the input vector and the output vector have a one-to-one relationship.
  • the input vector is a numerical representation and the output vector is the feature vector Z.
  • The calculation unit 12 inputs a plurality of numerical representations corresponding to the plurality of component objects into the machine learning model for the embedding function, and calculates the feature vector Z of each of the plurality of component objects.
  • For each of the plurality of component objects, the calculation unit 12 inputs the numerical representation corresponding to that component object into the machine learning model for the embedding function, and calculates the feature vector Z of that component object.
  • The machine learning model for the embedding function may generate a feature vector Z, which is a fixed-length vector, from a numerical representation that is atypical data. Atypical data refers to data that is not represented by a fixed-length vector.
  • The calculation unit 12 calculates the feature vector Z1 corresponding to polystyrene, the feature vector Z2 corresponding to polyacrylic acid, and the feature vector Z3 corresponding to polybutyl methacrylate.
  • The machine learning model for the embedding function is not limited, and may be decided by an arbitrary policy in consideration of factors such as the types of component objects and composite objects.
  • the calculation unit 12 may execute the embedding function using a graph neural network (GNN), a convolutional neural network (CNN), or a recurrent neural network (RNN).
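The idea of mapping variable-length input to a fixed-length feature vector can be illustrated without any neural network. The sketch below is a stand-in, not the trained GNN, CNN, or RNN of the source: it assumes the numerical representation is a sequence of non-negative integers and simply counts occurrences, and `embed` and `num_bins` are hypothetical names. The inputs are the example representations of Ea and Eb given for step S11.

```python
def embed(values, num_bins=8):
    """Map a variable-length sequence of non-negative integers to a fixed-length vector.

    Stand-in for a trained embedding model: counts how often each value occurs.
    Values at or above num_bins are clipped into the last bin.
    """
    z = [0] * num_bins
    for v in values:
        z[min(v, num_bins - 1)] += 1
    return z


ea = [1, 1, 2, 3, 4, 3, 3, 5, 6, 7, 5, 4]  # 12 values
eb = [1, 1, 5, 6, 3, 3, 5, 1, 7, 0, 0]     # 11 values
z_a, z_b = embed(ea), embed(eb)
print(len(z_a), len(z_b))
```

Although the two inputs differ in length, both outputs have `num_bins` elements, which is the property the later steps rely on.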
  • In step S122, which is part of step S12, the calculation unit 12 calculates another feature vector M from the feature vector Z for each of the plurality of component objects, using a machine learning model for an interaction function that causes a plurality of vectors to interact.
  • This machine learning model is a trained model.
  • the input vector and the output vector have a one-to-one relationship.
  • the input vector is the feature vector Z and the output vector is the feature vector M.
  • the calculation unit 12 inputs a set of a plurality of feature vectors Z corresponding to the plurality of component objects into the model for the interaction function, and calculates the feature vector M for each of the plurality of component objects.
  • The calculation unit 12 calculates the feature vector M1 corresponding to polystyrene, the feature vector M2 corresponding to polyacrylic acid, and the feature vector M3 corresponding to polybutyl methacrylate.
  • the machine learning model for the interaction function is not limited, and it may be decided by an arbitrary policy in consideration of factors such as the types of component objects and compound objects.
  • the calculation unit 12 may execute machine learning for an interaction function using an attention RNN (Attention RNN) or a multi-head attention (Multi-Head Attention).
  • the calculation unit 12 may calculate the feature vector M by an interaction function that does not include learning parameters.
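An interaction function without learning parameters could, for instance, append the mean of all feature vectors Z to each individual Z. This particular choice is an assumption made for illustration only; the source does not specify the function, and `interact` and the example vectors are hypothetical.

```python
def interact(z_vectors):
    """Return one vector M per component: its own Z followed by the mean of all Z."""
    n = len(z_vectors)
    mean = [sum(column) / n for column in zip(*z_vectors)]
    return [list(z) + mean for z in z_vectors]


z = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]  # hypothetical feature vectors Z1, Z2, Z3
m = interact(z)
print(m)
```

Each resulting M carries information about its own component and about the set as a whole, and the one-to-one relationship between Z and M is preserved.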
  • the calculation unit 12 calculates the regression parameter a of the linear regression model from the feature vector M for each of the plurality of component objects.
  • the calculation unit 12 calculates the regression parameters by the machine learning model.
  • This machine learning model is a trained model.
  • the input vector and the output value have a one-to-one relationship.
  • the input vector is the feature vector M and the output value is the regression parameter a.
  • the calculation unit 12 inputs a set of a plurality of feature vectors M corresponding to the plurality of component objects into the machine learning model, and calculates the regression parameter a for each of the plurality of component objects.
  • The calculation unit 12 calculates the regression parameter a1 corresponding to polystyrene, the regression parameter a2 corresponding to polyacrylic acid, and the regression parameter a3 corresponding to polybutyl methacrylate.
  • the machine learning model for calculating the regression parameters is not limited, and may be determined by any policy in consideration of factors such as the types of component objects and compound objects.
  • The calculation unit 12 may calculate the regression parameters using a fully connected neural network (FCNN).
  • The prediction unit 13 calculates the predicted value E by the following Scheffe polynomial (1) defined by the three regression parameters a1, a2, and a3.
  • E = a1·r1 + a2·r2 + a3·r3 (1)
  • The regression parameter a is the regression coefficient of the linear term of equation (1).
  • The predicted value E indicates the characteristics of the multi-component substance (polymer alloy) obtained from polystyrene, polyacrylic acid, and polybutyl methacrylate.
  • The variable r in equation (1) means a composite ratio.
  • The composite ratios of polystyrene, polyacrylic acid, and polybutyl methacrylate are represented as r1, r2, and r3, respectively.
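The evaluation of a first-order model can be sketched as follows, assuming equation (1) takes the standard first-order Scheffe form E = a1·r1 + a2·r2 + a3·r3. The function name `scheffe_linear` and all numerical values are hypothetical; in the described system the parameters would come from the machine learning model rather than being written by hand.

```python
def scheffe_linear(a, r):
    """First-order Scheffe polynomial: E = sum_i a_i * r_i."""
    if len(a) != len(r):
        raise ValueError("need one regression parameter per composite ratio")
    return sum(a_i * r_i for a_i, r_i in zip(a, r))


a = [1.0, 2.0, 4.0]    # hypothetical regression parameters a1, a2, a3
r = [0.5, 0.25, 0.25]  # hypothetical composite ratios r1, r2, r3 (sum to 1)
E = scheffe_linear(a, r)
print(E)
```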
  • In the example of FIG. 5, step S12, including steps S121 and S122, is the same as in the example of FIG. 4, whereas steps S13 and S14 differ from the example of FIG. 4.
  • The calculation unit 12 calculates the regression parameters of the linear regression model from the feature vector M for each of the plurality of component objects. Specifically, the calculation unit 12 calculates the regression parameter a of the first-order term and the regression parameter b of the second-order term. In one example, the calculation unit 12 calculates the regression parameters by machine learning such as an FCNN. A machine learning model is prepared for each of the first-order and second-order terms of the linear regression model.
  • the input vector and the output value have a one-to-one relationship.
  • the input vector is the feature vector M and the output value is the regression parameter a.
  • The calculation unit 12 inputs a set of a plurality of feature vectors M corresponding to the plurality of component objects into the machine learning model, and calculates the regression parameter a for each of the plurality of component objects. Also in the example of FIG. 5, the calculation unit 12 calculates the regression parameter a1 corresponding to polystyrene, the regression parameter a2 corresponding to polyacrylic acid, and the regression parameter a3 corresponding to polybutyl methacrylate.
  • each input vector is obtained by synthesizing two feature vectors.
  • This function is a function that calculates one regression parameter from two vectors.
  • two feature vectors M are combined.
  • The calculation unit 12 combines the two feature vectors M1 and M2 to generate a first input vector, combines the two feature vectors M1 and M3 to generate a second input vector, and combines the two feature vectors M2 and M3 to generate a third input vector.
  • The first input vector corresponds to polystyrene and polyacrylic acid, the second input vector corresponds to polystyrene and polybutyl methacrylate, and the third input vector corresponds to polyacrylic acid and polybutyl methacrylate.
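Generating one input vector per pair of components can be sketched with `itertools.combinations`. The source does not say how two feature vectors are combined; element-wise addition is assumed here because it gives the same result regardless of the order of the pair. The function name `pair_inputs` and the example vectors are hypothetical.

```python
from itertools import combinations


def pair_inputs(m_vectors):
    """Build one input vector per unordered pair (i, j), i < j, by element-wise sum."""
    return {
        (i + 1, j + 1): [x + y for x, y in zip(m_vectors[i], m_vectors[j])]
        for i, j in combinations(range(len(m_vectors)), 2)
    }


m = [[1, 0], [0, 1], [2, 2]]  # hypothetical feature vectors M1, M2, M3
pairs = pair_inputs(m)
print(pairs)
```

The keys use 1-based indices so that `(1, 2)`, `(1, 3)`, and `(2, 3)` line up with the first, second, and third input vectors.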
  • the input vector and the output value have a one-to-one relationship.
  • the input vector is a composite of two feature vectors M and the output value is the regression parameter b.
  • The calculation unit 12 inputs all combinations of input vectors into the machine learning model and calculates the regression parameter b for each combination. In the example of FIG. 5, the calculation unit 12 calculates the regression parameter b12 corresponding to the combination of polystyrene and polyacrylic acid, the regression parameter b13 corresponding to the combination of polystyrene and polybutyl methacrylate, and the regression parameter b23 corresponding to the combination of polyacrylic acid and polybutyl methacrylate.
  • The prediction unit 13 calculates the predicted value E by the following Scheffe polynomial (2) defined by the six regression parameters a1, a2, a3, b12, b13, and b23.
  • E = a1·r1 + a2·r2 + a3·r3 + b12·r1·r2 + b13·r1·r3 + b23·r2·r3 (2)
  • the regression parameter a is the regression coefficient of the first-order term
  • the regression parameter b is the regression coefficient of the second-order term.
  • the meaning of the variable r in the equation (2) is the compound ratio as in the equation (1).
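A second-order model can be evaluated in the same spirit, assuming equation (2) takes the standard second-order Scheffe form, that is, the first-order terms plus one term b_ij·r_i·r_j for each pair i < j. The function name `scheffe_quadratic`, the dictionary layout for b, and all numerical values are hypothetical.

```python
from itertools import combinations


def scheffe_quadratic(a, b, r):
    """Second-order Scheffe polynomial.

    a: list of first-order parameters a_i
    b: dict mapping 1-based pairs (i, j), i < j, to second-order parameters b_ij
    r: list of composite ratios r_i
    """
    linear = sum(a_i * r_i for a_i, r_i in zip(a, r))
    quadratic = sum(
        b[(i + 1, j + 1)] * r[i] * r[j]
        for i, j in combinations(range(len(r)), 2)
    )
    return linear + quadratic


a = [1.0, 2.0, 4.0]                           # hypothetical a1, a2, a3
b = {(1, 2): 8.0, (1, 3): 0.0, (2, 3): -8.0}  # hypothetical b12, b13, b23
r = [0.5, 0.25, 0.25]                         # hypothetical r1, r2, r3
E = scheffe_quadratic(a, b, r)
print(E)
```

The second-order terms let the model express effects that arise only when two components are present together, which a purely first-order model cannot capture.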
  • the information processing system 10 may output individual regression parameters based on the feature vectors of all the related component objects for the regression model including the terms of the third order or higher or other parameters.
  • the information processing system 10 may output one regression parameter based on the feature vectors of all component objects.
  • the calculation unit 12 executes both the embedding function and the interaction function, but one of these two functions may be omitted.
  • The calculation unit 12 may calculate the regression parameter from the feature vector Z obtained by the machine learning model for the embedding function. In any case, the calculation unit 12 executes machine learning to calculate the regression parameters.
  • The machine learning model for the embedding function, the machine learning model for the interaction function, the machine learning model for the regression parameters, and the regression model may be constructed by one neural network, or may be constructed by a set of a plurality of neural networks.
  • The machine learning model for the embedding function, the machine learning model for the interaction function, and the machine learning model for the regression parameters may be constructed by one neural network, or may be constructed by a set of a plurality of neural networks.
  • the information processing program for making a computer or a computer system function as an information processing system 10 includes a program code for making the computer system function as an acquisition unit 11, a calculation unit 12, and a prediction unit 13.
  • This information processing program may be provided after being temporarily recorded on a tangible recording medium such as a CD-ROM, a DVD-ROM, or a semiconductor memory. Alternatively, the information processing program may be provided via a communication network as a data signal superimposed on a carrier wave.
  • the provided information processing program is stored in, for example, the auxiliary storage unit 103.
  • Each of the above functional elements is realized by the processor 101 reading the information processing program from the auxiliary storage unit 103 and executing the information processing program.
  • the information processing system includes at least one processor. At least one processor gets the numerical representations and compound ratios for each of the multiple component objects and performs machine learning based on the multiple numerical representations to calculate multiple regression parameters for the multiple component objects. Then, a plurality of composite ratios are applied to the regression model defined by the plurality of regression parameters, and a predicted value indicating the characteristics of the composite object obtained by combining the plurality of component objects is calculated.
  • the information processing method is executed by an information processing system including at least one processor.
  • This information processing method involves obtaining numerical representations and compound ratios for each of multiple component objects, and performing machine learning based on multiple numerical representations to perform multiple regression parameters for multiple component objects. And a step to calculate the predicted value indicating the characteristics of the composite object obtained by applying multiple composite ratios to the regression model defined by multiple regression parameters and combining multiple component objects. including.
  • The information processing program causes a computer to execute: a step of acquiring a numerical representation and a composite ratio for each of a plurality of component objects; a step of executing machine learning based on the plurality of numerical representations to calculate a plurality of regression parameters corresponding to the plurality of component objects; and a step of applying the plurality of composite ratios to a regression model defined by the plurality of regression parameters to calculate a predicted value indicating the characteristics of a composite object obtained by combining the plurality of component objects.
  • Machine learning is executed based on the data of each component object, and a plurality of regression parameters corresponding to the plurality of component objects are calculated. The composite ratios are then applied to the regression model defined by the regression parameters, and the characteristics of the composite object are predicted.
  • Because the regression parameters are calculated from the data of the component objects, the composite ratios can be changed and the characteristics of the composite object recalculated immediately with the regression model. That is, the calculated regression parameters can be reused.
  • The at least one processor may input the plurality of numerical representations into a first machine learning model to calculate a plurality of feature vectors corresponding to the plurality of component objects, and may input the plurality of feature vectors into a second machine learning model to calculate the plurality of regression parameters.
  • The first machine learning model may include a machine learning model for an embedding function and a machine learning model for an interaction function.
  • The at least one processor may input the plurality of numerical representations into the machine learning model for the embedding function to calculate a plurality of first feature vectors corresponding to the plurality of component objects, may input the plurality of first feature vectors into the machine learning model for the interaction function to calculate a plurality of second feature vectors corresponding to the plurality of component objects, and may input the plurality of second feature vectors into the second machine learning model to calculate the plurality of regression parameters.
  • The machine learning model for the embedding function may be a machine learning model that generates the first feature vector, which is a fixed-length vector, from the numerical representation, which is unstructured data.
  • By generating the first feature vector as a fixed-length vector, feature vectors can be obtained even from numerical representations that cannot be expressed as fixed-length vectors.
  • The regression model may be Scheffé's polynomial.
  • At least one processor may calculate a plurality of regression coefficients of a linear term of Scheffé's polynomial as a plurality of regression parameters.
  • By using Scheffé's polynomial, which is often used in blending problems, a composite object obtained by combining a plurality of component objects can be analyzed accurately.
  • The regression coefficients of the linear terms can be used to calculate a predicted value that takes into account the individual influence of each component object.
  • The at least one processor may further calculate a plurality of regression coefficients of the quadratic terms of Scheffé's polynomial as the plurality of regression parameters.
  • The regression coefficients of the quadratic terms can be used to calculate a predicted value that further takes into account the influence of combining two component objects.
  • The component object may be a material, and the composite object may be a multi-component substance.
  • The material may be a polymer or a monomer, and the multi-component substance may be a polymer alloy.
  • Polymers and monomers are very diverse, and correspondingly there is a huge variety of polymer alloys. For such polymers, monomers, and polymer alloys, generally only some of the possible combinations can be tested, so sufficient data are often not available. According to this aspect, the polymer alloy can be analyzed with high accuracy even when the data are insufficient in this way.
  • The processing procedure of the information processing method executed by the at least one processor is not limited to the example in the above embodiment. For example, some of the steps or processes described above may be omitted, or the steps may be performed in a different order. Further, any two or more of the above-mentioned steps may be combined, or a part of the steps may be modified or deleted. Alternatively, other steps may be performed in addition to each of the above steps.
  • The expression "at least one processor executes the first process, executes the second process, ... executes the nth process", or an expression corresponding thereto, denotes a concept that includes the case where the processor executing the n processes, from the first process to the nth process, changes partway through. That is, the expression covers both the case where all n processes are executed by the same processor and the case where the processor changes according to an arbitrary policy during the n processes.
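
The regression model described above can be illustrated with a minimal sketch of Scheffé's polynomial, y = Σ a_i·r_i + Σ_{i<j} a_ij·r_i·r_j, where r_i are the composite ratios, a_i the linear-term coefficients, and a_ij the quadratic-term coefficients. The function name `scheffe_predict`, the variable names, and all coefficient values below are assumptions made for illustration; in the described system the coefficients would be produced by machine learning, not written by hand.

```python
from itertools import combinations


def scheffe_predict(ratios, linear_coeffs, quadratic_coeffs=None):
    """Evaluate a Scheffe polynomial for one set of composite ratios.

    ratios           -- composite ratio of each component object
    linear_coeffs    -- one regression coefficient per component object
    quadratic_coeffs -- optional dict {(i, j): coefficient} with i < j;
                        pairs that are absent contribute nothing
    """
    if len(ratios) != len(linear_coeffs):
        raise ValueError("need one linear coefficient per component object")
    # Linear terms: individual influence of each component object.
    value = sum(a * r for a, r in zip(linear_coeffs, ratios))
    # Quadratic terms: influence of each pair of component objects.
    if quadratic_coeffs:
        for i, j in combinations(range(len(ratios)), 2):
            value += quadratic_coeffs.get((i, j), 0.0) * ratios[i] * ratios[j]
    return value


# Hypothetical regression parameters for three component objects.
linear = [10.0, 20.0, 30.0]
quadratic = {(0, 1): 4.0, (1, 2): -8.0}

# The same parameters are reused while only the composite ratios change.
first = scheffe_predict([0.5, 0.5, 0.0], linear, quadratic)
second = scheffe_predict([0.2, 0.3, 0.5], linear, quadratic)
```

Both calls share one set of regression parameters, which mirrors the point that a changed composite ratio only requires re-evaluating the polynomial rather than recomputing the parameters.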

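The flow from numerical representations to regression parameters can be sketched in the same spirit. The sketch below is an assumption-laden stand-in, not the disclosed models: the embedding step is replaced by sum pooling over variable-length rows, the interaction step by adding the mean of all first feature vectors, and the second model by a fixed linear map. No training takes place, and the names `embed`, `interact`, `regression_parameter`, and all numbers are hypothetical.

```python
def embed(rows):
    """Pool a variable-length list of equal-width rows into one fixed-length vector."""
    return [sum(column) for column in zip(*rows)]


def interact(vectors):
    """Add the mean of all first feature vectors to each of them."""
    count = len(vectors)
    mean = [sum(column) / count for column in zip(*vectors)]
    return [[v + m for v, m in zip(vector, mean)] for vector in vectors]


def regression_parameter(vector, weights, bias=0.0):
    """Map one second feature vector to one linear-term coefficient."""
    return sum(w * v for w, v in zip(weights, vector)) + bias


# Hypothetical numerical representations with different numbers of rows.
component_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
component_b = [[2.0, 0.0]]

first_vectors = [embed(component_a), embed(component_b)]
second_vectors = interact(first_vectors)

# Hand-picked weights stand in for a trained second model.
weights = [0.5, 1.0]
coefficients = [regression_parameter(v, weights) for v in second_vectors]

# Apply composite ratios to the linear regression model.
ratios = [0.25, 0.75]
predicted = sum(a * r for a, r in zip(coefficients, ratios))
```

Sum pooling is used here only because it yields a vector of the same length regardless of how many rows a component has; any model with that property could take its place.
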
Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
PCT/JP2021/042833 2020-11-27 2021-11-22 情報処理システム、情報処理方法、および情報処理プログラム Ceased WO2022113945A1 (ja)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP21897918.5A EP4243026A4 (en) 2020-11-27 2021-11-22 Information processing system, information processing method, and information processing program
KR1020237021006A KR20230110584A (ko) 2020-11-27 2021-11-22 정보 처리 시스템, 정보 처리 방법, 및 정보 처리 프로그램
CN202180089147.0A CN116745850A (zh) 2020-11-27 2021-11-22 信息处理系统、信息处理方法及信息处理程序
JP2022565331A JP7803284B2 (ja) 2020-11-27 2021-11-22 情報処理システム、情報処理方法、および情報処理プログラム
US18/254,384 US20240047018A1 (en) 2020-11-27 2021-11-22 Information processing system, information processing method, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020197046 2020-11-27
JP2020-197046 2020-11-27

Publications (1)

Publication Number Publication Date
WO2022113945A1 (ja) 2022-06-02

Family

ID=81754598

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/042833 Ceased WO2022113945A1 (ja) 2020-11-27 2021-11-22 情報処理システム、情報処理方法、および情報処理プログラム

Country Status (6)

Country Link
US (1) US20240047018A1
EP (1) EP4243026A4
JP (1) JP7803284B2
KR (1) KR20230110584A
CN (1) CN116745850A
WO (1) WO2022113945A1

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102457159B1 (ko) * 2021-01-28 2022-10-20 전남대학교 산학협력단 딥러닝 기반 화합물 의약 효과 예측 방법
US12587274B2 (en) 2023-03-28 2026-03-24 Quantum Generative Materials Llc Satellite optimization management system based on natural language input and artificial intelligence
KR20250080631A (ko) * 2023-11-28 2025-06-05 주식회사 Lg 경영개발원 인공지능을 이용한 다상을 갖는 소재의 특성 예측 시스템, 방법 및 프로그램
US12368503B2 (en) 2023-12-27 2025-07-22 Quantum Generative Materials Llc Intent-based satellite transmit management based on preexisting historical location and machine learning
US12603701B2 (en) 2023-12-27 2026-04-14 Quantum Generative Materials Llc Distributed satellite constellation management and control system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140163937A1 (en) * 2012-12-12 2014-06-12 Hyundai Motor Company Method for predicting physical properties of a composite blend of polypropylene and low density polypropylene
JP2019028879A (ja) 2017-08-02 2019-02-21 学校法人立命館 結合性予測方法、装置、プログラム、記録媒体、および機械学習アルゴリズムの製造方法
JP2020030638A (ja) * 2018-08-23 2020-02-27 パナソニックIpマネジメント株式会社 材料情報出力方法、材料情報出力装置、材料情報出力システム、及びプログラム
JP2020038493A (ja) * 2018-09-04 2020-03-12 横浜ゴム株式会社 物性データ予測方法及び物性データ予測装置
WO2020090805A1 (ja) * 2018-10-31 2020-05-07 昭和電工株式会社 材料探索装置、方法、およびプログラム
JP2020161044A (ja) * 2019-03-28 2020-10-01 日立化成株式会社 データ管理システム、データ管理方法、およびデータ管理プログラム

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11154196B2 (en) * 2017-06-20 2021-10-26 Siemens Healthcare Gmbh Deep-learnt tissue deformation for medical imaging
JP7218519B2 (ja) * 2018-09-04 2023-02-07 横浜ゴム株式会社 物性データ予測方法及び物性データ予測装置
US11631029B2 (en) * 2019-09-09 2023-04-18 Adobe Inc. Generating combined feature embedding for minority class upsampling in training machine learning models with imbalanced samples

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4243026A4

Also Published As

Publication number Publication date
KR20230110584A (ko) 2023-07-24
CN116745850A (zh) 2023-09-12
JPWO2022113945A1 2022-06-02
US20240047018A1 (en) 2024-02-08
EP4243026A4 (en) 2024-05-15
EP4243026A1 (en) 2023-09-13
JP7803284B2 (ja) 2026-01-21

Similar Documents

Publication Publication Date Title
JP7509152B2 (ja) 情報処理システム、情報処理方法、および情報処理プログラム
WO2022113945A1 (ja) 情報処理システム、情報処理方法、および情報処理プログラム
CN111461168A (zh) 训练样本扩充方法、装置、电子设备及存储介质
JP7395974B2 (ja) 入力データ生成システム、入力データ生成方法、及び入力データ生成プログラム
Robert Approximate Bayesian computation: a survey on recent results
JP7509153B2 (ja) 情報処理システム、情報処理方法、および情報処理プログラム
JP7339924B2 (ja) 材料の特性値を推定するシステム
Ghafarollahi et al. Rapid and automated alloy design with graph neural network-powered llm-driven multi-agent systems
US20230273771A1 (en) Secret decision tree test apparatus, secret decision tree test system, secret decision tree test method, and program
JP7571781B2 (ja) 情報処理システム、情報処理方法、および情報処理プログラム
JP7494932B2 (ja) 秘密決定木テスト装置、秘密決定木テストシステム、秘密決定木テスト方法、及びプログラム
JP2020161044A (ja) データ管理システム、データ管理方法、およびデータ管理プログラム
Xavier et al. Genome assembly using reinforcement learning
CN119623516B (zh) 基于大语言模型的任务响应方法及装置、电子设备、介质
CN116523058B (zh) 量子门操作信息的获取方法及装置、量子计算机
JP2026068771A (ja) 異種樹脂材料間の相性予測システム、異種樹脂材料間の相性予測方法
WO2026078973A1 (ja) 異種樹脂材料間の相性予測システム、異種樹脂材料間の相性予測方法
Pierri et al. Beyond the Cox Model: Applying Machine Learning Techniques with Time-to-Event Data
CN117893316A (zh) 一种构建指数的量子方法及装置
Kassoul KNNOR-Reg: A Python Package for Oversampling in Imbalanced Regression
Eisenbach et al. Fast and Accurate Predictions of Total Energy for Solid Solution Alloys with Graph Convolutional Neural Networks
Styger AN EXPLORATION OF APPLYING KNOWLEDGE BASED ENGINEERING INTO A QUALITY MANAGEMENT FRAMEWORK-EXTENDING THE QUALITY TRIANGLE FOR ESTABLISHING THE FIRST PRINCIPLES OF KNOWLEDGE BUSINESS MODELLING
Constantinescu Fusion of data-and model-driven discovery August 16, 2017 Question addressed: Convergence of data-and model-driven discovery

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21897918; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2022565331; Country of ref document: JP; Kind code of ref document: A)
WWE WIPO information: entry into national phase (Ref document number: 18254384; Country of ref document: US)
ENP Entry into the national phase (Ref document number: 2021897918; Country of ref document: EP; Effective date: 20230609)
ENP Entry into the national phase (Ref document number: 20237021006; Country of ref document: KR; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
WWE WIPO information: entry into national phase (Ref document number: 202180089147.0; Country of ref document: CN)