WO2021220776A1 - System for estimating material property values - Google Patents

System for estimating material property values

Info

Publication number
WO2021220776A1
WO2021220776A1 PCT/JP2021/015044 JP2021015044W WO2021220776A1 WO 2021220776 A1 WO2021220776 A1 WO 2021220776A1 JP 2021015044 W JP2021015044 W JP 2021015044W WO 2021220776 A1 WO2021220776 A1 WO 2021220776A1
Authority
WO
WIPO (PCT)
Prior art keywords
material property
dimensional
descriptor
materials
dimensional descriptor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2021/015044
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
拓也 金澤
秀和 森田
彰規 淺原
貴之 林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd
Priority to EP21797897.2A (EP4145328A4)
Priority to US17/920,052 (US20230081583A1)
Publication of WO2021220776A1
Anticipated expiration
Ceased (current legal status)

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C60/00 Computational materials science, i.e. ICT specially adapted for investigating the physical or chemical properties of materials or phenomena associated with their design, synthesis, processing, characterisation or utilisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/10 Geometric CAD
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2113/00 Details relating to the application field
    • G06F2113/26 Composites
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics

Definitions

  • the present invention relates to a system for estimating material property values.
  • a virtual screening method is used for a new material search task.
  • This method applies a machine learning model to the data of a known material (compound) to construct a characteristic value estimation model that estimates the characteristic value of the material. More specifically, from the chemical structural formula of the material, a descriptor expressing the characteristics of the material in multiple variables is generated. Furthermore, the characteristic value estimation model is constructed by learning the relationship between the descriptor and the characteristic value. The characteristic value estimation model estimates the characteristic value for the input descriptor.
  • the descriptor is composed of a plurality of elements (features), and each element represents a corresponding feature, for example, a molecular weight or an element mixture ratio.
  • the virtual screening method generates descriptors from the chemical structural formulas of many compounds whose characteristic values are unknown, and applies the above characteristic value estimation model to those descriptors. Screening is performed based on the characteristic values calculated in this way, and a chemical structural formula expected to have characteristic values exceeding the threshold value is presented as a promising compound that is a candidate for an experiment or simulation.
  • the user conducts experiments and simulations of materials selected from the candidates and evaluates those materials.
  • the number of experiments and simulations required for the material can be reduced, and the material having the desired characteristic value can be efficiently obtained.
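The conventional single-stage virtual screening described above can be sketched as follows. This is a minimal illustration, not the method of any cited document: `virtual_screen`, `toy_estimate`, the two-element descriptors, and the threshold are all hypothetical, and a fixed weighted sum stands in for a trained characteristic value estimation model.

```python
from typing import Callable, Dict, List, Sequence

Descriptor = Sequence[float]


def virtual_screen(
    candidates: Dict[str, Descriptor],
    estimate: Callable[[Descriptor], float],
    threshold: float,
) -> List[str]:
    """Return candidates whose estimated characteristic value exceeds the threshold."""
    promising = []
    for name, descriptor in candidates.items():
        if estimate(descriptor) > threshold:
            promising.append(name)
    return promising


def toy_estimate(descriptor: Descriptor) -> float:
    # Assumption: a fixed weighted sum stands in for a trained model.
    weights = (0.5, 2.0)
    return sum(w * x for w, x in zip(weights, descriptor))


# Hypothetical candidates with precomputed two-element descriptors.
candidates = {"A": (1.0, 0.1), "B": (2.0, 1.0), "C": (0.5, 0.2)}
promising = virtual_screen(candidates, toy_estimate, threshold=1.0)
print(promising)
```

Only the candidates returned here would go on to experiment or simulation.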
  • Methods for generating a material descriptor are disclosed in, for example, Non-Patent Document 1 and Non-Patent Document 2. These disclose methods, in the field of inorganic chemistry, for discovering a descriptor consisting of a small number of combinations of descriptor elements useful for estimation from among several thousand to tens of thousands of descriptor elements (features).
  • In order to find a compound having a desired characteristic value, virtual screening generates descriptors for a huge number of candidate compounds and estimates their characteristic values with the characteristic value estimation model. Further, the number of dimensions (number of elements) of a descriptor that represents the characteristics of a chemical structural formula with high accuracy is very large, generally about 1000 to 4000.
  • One aspect of the present invention is a system for estimating characteristic values of materials, including one or more processors and one or more storage devices for storing programs executed by the one or more processors.
  • the one or more storage devices store the first learning model and the second learning model.
  • the one or more processors generate a low-dimensional descriptor consisting of a predetermined number of elements for each of the plurality of materials.
  • the one or more processors estimate the characteristic values of each of the plurality of materials from the low-dimensional descriptor by the first learning model.
  • the one or more processors select a part of the materials from the plurality of materials based on the characteristic values.
  • The one or more processors generate, for each of the selected materials, a high-dimensional descriptor whose number of elements is larger than the predetermined number.
  • the one or more processors estimate the characteristic values of each of the part of the materials from the high-dimensional descriptor by the second learning model.
  • a promising material that can be expected to have a desired characteristic value can be selected more efficiently.
  • a logical configuration example of the material property estimation device is schematically shown.
  • An example of the hardware configuration of the material property estimation device is shown.
  • a flowchart of an example of the overall processing of the material property estimation device is shown.
  • An example of a graphical user interface for inputting material experiment data displayed on a monitor is schematically shown.
  • An example of the configuration of the experimental material database is shown.
  • An example of a graphical user interface, displayed on a monitor, for inputting a list of materials whose material property values are to be estimated is schematically shown.
  • a configuration example of the material structural formula database is shown.
  • a detailed flowchart of the learning process of the low-dimensional material property estimation model is shown.
  • An example of the configuration of the descriptor list passed by the descriptor calculation unit to the material property estimation model learning unit is shown.
  • An example of the material property measurement value (material property measurement value list) acquired from the experimented material database by the material property estimation model learning unit is shown.
  • a configuration example of the material property estimation result (material property estimation value list) passed from the material property estimation unit to the material selection unit is shown.
  • a detailed flowchart of the learning process of the high-dimensional material property estimation model is shown.
  • An image example of the material property estimation result displayed on the monitor by the material property estimation result display unit is shown.
  • This system may be a physical computer system (one or more physical computers) or a system built on a group of computer resources (multiple computer resources) such as a cloud platform.
  • A computer system or computational resource group includes one or more interface devices (including, for example, communication devices and input/output devices), one or more storage devices (including, for example, memory (main storage) and auxiliary storage devices), and one or more processors.
  • the process described with the function as the subject may be a process performed by a processor or a system having the processor.
  • the program may be installed from the program source.
  • the program source may be, for example, a program distribution computer or a computer-readable storage medium (eg, a computer-readable non-transient storage medium).
  • the description of each function is an example, and a plurality of functions may be combined into one function, or one function may be divided into a plurality of functions.
  • the material property estimation device narrows down the population of candidate materials in two stages.
  • the material property estimator calculates the low-dimensional descriptors for each of the candidate materials.
  • the material property estimator estimates material property values from each of the low-dimensional descriptors using a simple machine learning model.
  • the material property estimation device selects some materials based on the material property estimation values.
  • the material property estimation device calculates a high-dimensional descriptor for each of the selected materials.
  • the material property estimator estimates material property values from each of the high-dimensional descriptors using a more accurate machine learning model.
  • The material property estimation device finally selects the materials to be presented to the user as candidates based on these material property estimation values. In this way, by selecting the materials for which high-dimensional descriptors are generated based on the material property estimation results from the low-dimensional descriptors, materials that can be expected to have the desired material properties can be selected efficiently and quickly.
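The two-stage narrowing described above can be sketched as follows. This is an illustration under stated assumptions: `two_stage_screen` and all toy descriptor and model functions are hypothetical, the "keep the highest estimates" rule is only one of the possible selection criteria, and the counter exists solely to show that the expensive descriptor is computed for the shortlist rather than for every candidate.

```python
from typing import Callable, List, Sequence, Tuple

Descriptor = Sequence[float]


def two_stage_screen(
    materials: Sequence[str],
    low_descriptor: Callable[[str], Descriptor],
    low_model: Callable[[Descriptor], float],
    high_descriptor: Callable[[str], Descriptor],
    high_model: Callable[[Descriptor], float],
    keep: int,
) -> List[Tuple[str, float]]:
    # Stage 1: cheap descriptor and simple model for every candidate.
    coarse = [(m, low_model(low_descriptor(m))) for m in materials]
    # Shortlist the `keep` materials with the highest coarse estimates.
    shortlist = sorted(coarse, key=lambda pair: pair[1], reverse=True)[:keep]
    # Stage 2: expensive descriptor and more accurate model, shortlist only.
    refined = [(m, high_model(high_descriptor(m))) for m, _ in shortlist]
    return sorted(refined, key=lambda pair: pair[1], reverse=True)


high_calls: List[str] = []  # records which materials got a high-dimensional descriptor


def toy_low_descriptor(smiles: str) -> Descriptor:
    return (float(len(smiles)),)  # toy one-element descriptor


def toy_high_descriptor(smiles: str) -> Descriptor:
    high_calls.append(smiles)
    return (float(len(smiles)), float(smiles.count("O")))  # toy two-element descriptor


result = two_stage_screen(
    ["CCO", "CCN", "C", "CCCCO", "CCOC"],
    toy_low_descriptor,
    lambda d: d[0],
    toy_high_descriptor,
    lambda d: d[0] + 2.0 * d[1],
    keep=2,
)
print(result, len(high_calls))
```

With five candidates and `keep=2`, the high-dimensional descriptor is computed only twice; the saving grows with the size of the candidate population.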
  • FIG. 1 schematically shows a logical configuration example of the material property estimation device according to the embodiment of the present specification.
  • the material property estimation device 100 stores the material structural formula database 105, the experimental material database 106, and the selected structural formula database 112.
  • The material property estimation device 100 includes an experimental data reception unit 103, a material list reception unit 104, a descriptor calculation unit 107, a material property estimation model learning unit 108, a material property estimation unit 109, a material selection unit 110, and a material property estimation result display unit 111. These are programs, and one or more processors of the material property estimation device 100 can operate as functional units corresponding to the respective programs by executing these programs. Any function of the material property estimation device 100 can be implemented in any program.
  • The experimental data receiving unit 103 receives experimental data indicating characteristic values of various materials, input by the user 200 via the input/output device, and stores the experimental data in the experimental material database 106.
  • The material list reception unit 104 receives data on the chemical structural formulas of many materials, input by the user 200 via the input/output device, and stores the data in the material structural formula database 105.
  • the material structural formula database 105 stores data of materials (chemical structural formulas) that are not stored in the experimental material database 106.
  • the descriptor calculation unit 107 generates a descriptor from the chemical structural formula by a predetermined method.
  • the descriptor represents the characteristics of the material represented by the chemical structural formula.
  • the descriptor is represented by a vector composed of a plurality of elements (features). Each element represents a corresponding feature, such as molecular weight or elemental mixing ratio.
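As a small illustration of a descriptor vector whose elements are the molecular weight and elemental mixing ratios, the sketch below builds one from a molecular formula. It is a simplification: the device starts from a chemical structural formula, whereas this example assumes a plain molecular formula such as `C2H6O`, covers only four elements, and uses hypothetical names (`count_atoms`, `descriptor`, `ELEMENT_ORDER`).

```python
import re
from typing import Dict, List

# Standard atomic masses (rounded) for the few elements this toy example covers.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}
ELEMENT_ORDER = ("C", "H", "N", "O")  # fixed element order keeps the vector layout constant


def count_atoms(formula: str) -> Dict[str, int]:
    """Count atoms in a simple molecular formula such as 'C2H6O'."""
    counts: Dict[str, int] = {}
    for symbol, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
    return counts


def descriptor(formula: str) -> List[float]:
    """Return [molecular weight, atom fraction of C, H, N, O]."""
    counts = count_atoms(formula)
    total_atoms = sum(counts.values())
    weight = sum(ATOMIC_MASS[s] * n for s, n in counts.items())
    ratios = [counts.get(s, 0) / total_atoms for s in ELEMENT_ORDER]
    return [weight] + ratios


print(descriptor("C2H6O"))
```

Because the element order is fixed, every material yields a vector of the same length and meaning, which is what allows a single model to be trained across materials.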
  • the descriptor calculation unit 107 can generate a low-dimensional descriptor having a small number of elements and a high-dimensional descriptor having a large number of elements from one chemical structural formula.
  • the descriptor calculation unit 107 may be divided into modules for each of the low-dimensional descriptor and the high-dimensional descriptor.
  • The number of elements and the types of elements of each of the low-dimensional descriptor and the high-dimensional descriptor are constant, and the number of elements of the low-dimensional descriptor is smaller than that of the high-dimensional descriptor. All of the element types of the low-dimensional descriptor may be included in the element types of the high-dimensional descriptor, or some or all of them may differ from the element types of the high-dimensional descriptor.
  • the descriptor calculation unit 107 may determine the importance of the elements of the high-dimensional descriptor with respect to the characteristic value estimation, and may determine the elements constituting the low-dimensional descriptor based on the importance. As a result, a more appropriate candidate material can be selected by estimating the material property value from the low-dimensional descriptor.
  • the descriptor calculation unit 107 uses the high-dimensional descriptor to perform learning with a decision tree ensemble learner such as random forest or gradient boosting, and calculates the importance of each element of the high-dimensional descriptor.
  • The descriptor calculation unit 107 selects a predetermined number of descriptor elements in descending order of importance.
  • The descriptor calculation unit 107 may instead use permutation importance or Lasso linear regression to determine the importance of the elements of the high-dimensional descriptor.
  • the machine learning model for selecting the elements of the low-dimensional descriptor may be the same as or different from the machine learning model for estimating material property values from the high-dimensional descriptor.
  • the algorithms between them may also be the same or different.
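One way to picture importance-based selection of low-dimensional descriptor elements is the sketch below, which computes a permutation-style importance (the increase in mean squared error when one column is shuffled) and keeps the top-ranked elements. It is only an illustration: `toy_model` is a hand-written function standing in for a trained learner such as a random forest, and every name and number is hypothetical.

```python
import random
from typing import Callable, List, Sequence

Predict = Callable[[Sequence[float]], float]


def mean_squared_error(predict: Predict, rows: List[List[float]], targets: List[float]) -> float:
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(
    predict: Predict, rows: List[List[float]], targets: List[float], seed: int = 0
) -> List[float]:
    """Importance of element j = error increase after shuffling column j."""
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        rng.shuffle(column)
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
        importances.append(mean_squared_error(predict, shuffled, targets) - baseline)
    return importances


def top_k_elements(importances: Sequence[float], k: int) -> List[int]:
    """Indices of the k most important elements, returned in column order."""
    ranked = sorted(range(len(importances)), key=lambda j: importances[j], reverse=True)
    return sorted(ranked[:k])


def toy_model(row: Sequence[float]) -> float:
    # Assumption: stands in for a trained model; it ignores element 1 entirely.
    return 3.0 * row[0] + 1.0 * row[2]


data_rng = random.Random(1)
rows = [[data_rng.random() for _ in range(3)] for _ in range(50)]
targets = [toy_model(r) for r in rows]
importances = permutation_importance(toy_model, rows, targets)
low_dim_elements = top_k_elements(importances, k=2)
print(low_dim_elements)
```

Element 1 receives zero importance because the toy model never reads it, so it is the one dropped from the low-dimensional descriptor.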
  • the material property estimation model learning unit 108 learns a material property estimation model (learning model) capable of estimating a predetermined characteristic value from the descriptor of the chemical structural formula (material).
  • the material property estimation device 100 includes a low-dimensional material property estimation model (first learning model) that estimates one or more predetermined types of property values from a low-dimensional descriptor.
  • a high-dimensional material property estimation model (second learning model) that estimates similar types of property values from the high-dimensional descriptor is prepared.
  • a plurality of low-dimensional material property estimation models may be prepared, the number of dimensions thereof may be different or common, and the combination of element types may be the same or different among the models.
  • The number of dimensions of the low-dimensional descriptor of every low-dimensional material property estimation model is smaller than the number of dimensions of the high-dimensional descriptor.
  • The material property estimation model can be configured to estimate one or more types of property values. In the examples described below, the material property estimation model estimates (outputs) a single property value.
  • the regression algorithms used by the low-dimensional material property estimation model and the high-dimensional material property estimation model are arbitrary, and these algorithms may be the same or different. For example, any algorithm can be selected from various regression algorithms including random forest, support vector machine, Gaussian process regression, and neural network.
  • the material property estimation unit 109 uses the trained low-dimensional material property estimation model to obtain the estimated material property value from the low-dimensional descriptor. Further, the material property estimation unit 109 uses the trained high-dimensional material property estimation model to obtain the estimated material property value from the high-dimensional descriptor.
  • The low-dimensional descriptor is generated from all the chemical structural formulas (materials) stored in the material structural formula database 105. High-dimensional descriptors are generated for the materials whose material property values estimated from the low-dimensional descriptors are close to ideal.
  • The material selection unit 110 selects the materials (chemical structural formulas) for which high-dimensional descriptors are to be generated, based on the material property values estimated from the low-dimensional descriptors, and stores the information on the selected materials (chemical structural formulas) in the selected structural formula database 112.
  • The criteria for material selection depend on the nature of the material property value and the user's request. For example, a predetermined number of materials whose estimated material property values are closest to a target value, or the materials whose estimated values fall within a predetermined range, may be selected.
  • The material selection unit 110 may select a predetermined number of materials showing the highest material property values, or may select the materials whose material property values exceed a predetermined threshold value.
  • The material selection unit 110 may select a predetermined number of materials showing the lowest material property values, or may select the materials whose material property values are less than a predetermined threshold value.
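The selection criteria listed above (closest to a target, within a range, highest or lowest N, above or below a threshold) can each be written as a small function over a table of estimates keyed by material number. The function names and the sample estimates below are hypothetical and only illustrate the criteria.

```python
from typing import Dict, List


def select_top(estimates: Dict[int, float], count: int, highest: bool = True) -> List[int]:
    """The `count` materials with the highest (or lowest) estimated values."""
    ranked = sorted(estimates, key=lambda n: estimates[n], reverse=highest)
    return ranked[:count]


def select_by_threshold(estimates: Dict[int, float], threshold: float, above: bool = True) -> List[int]:
    """Materials whose estimate exceeds (or is less than) the threshold."""
    if above:
        return [n for n, v in estimates.items() if v > threshold]
    return [n for n, v in estimates.items() if v < threshold]


def select_nearest(estimates: Dict[int, float], target: float, count: int) -> List[int]:
    """The `count` materials whose estimates are closest to the target value."""
    return sorted(estimates, key=lambda n: abs(estimates[n] - target))[:count]


def select_in_range(estimates: Dict[int, float], low: float, high: float) -> List[int]:
    """Materials whose estimates fall within [low, high]."""
    return [n for n, v in estimates.items() if low <= v <= high]


# Hypothetical estimates keyed by material number.
estimates = {1: 0.2, 2: 0.9, 3: 0.55, 4: 0.7}
print(select_top(estimates, 2))
print(select_nearest(estimates, target=0.6, count=2))
```

Which function applies depends on whether the property should be maximized, minimized, or matched to a target.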
  • When a plurality of low-dimensional material property estimation models are used, the materials for which high-dimensional descriptors are to be generated may be selected based on a statistic of the estimates of these models (for example, a weighted average, which includes the simple average).
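Combining the estimates of several low-dimensional models with a weighted average can be sketched as follows. The weights, the per-model estimates, and the function name are hypothetical; equal weights reduce to the simple average.

```python
from typing import Dict, List, Sequence


def weighted_average(estimates: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of one material's estimates from several models."""
    if len(estimates) != len(weights):
        raise ValueError("one weight is required per model estimate")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(e * w for e, w in zip(estimates, weights)) / total


# Hypothetical estimates from three low-dimensional models, keyed by material number.
per_model: Dict[int, List[float]] = {1: [0.8, 0.6, 0.7], 2: [0.2, 0.4, 0.3]}
weights = [2.0, 1.0, 1.0]  # assumption: the first model is trusted twice as much

combined = {n: weighted_average(values, weights) for n, values in per_model.items()}
print(combined)
```

The combined value per material can then be fed to whichever selection criterion is in use.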
  • the material property estimation result display unit 111 acquires the material property value estimation result from the high-dimensional descriptor of the selected material by the high-dimensional material property value estimation model.
  • the material property estimation result display unit 111 presents a promising material to the user 200 by displaying the material property value estimation result together with the corresponding chemical structure.
  • The material property estimation result display unit 111 may display the estimation results of all the selected materials, or may display the estimation results of only those materials that show preferable estimated values according to a predetermined criterion.
  • FIG. 2 shows an example of the hardware configuration of the material property estimation device 100.
  • the material property estimation device 100 includes a processor 151 having arithmetic performance and a DRAM 152 providing a volatile temporary storage area for storing programs and data executed by the processor 151.
  • The material property estimation device 100 further includes a communication device 153 that performs data communication with other devices, and an auxiliary storage device 154 that provides a permanent information storage area using an HDD (Hard Disk Drive), a flash memory, or the like.
  • The auxiliary storage device 154 stores programs such as the experimental data reception unit 103, the material list reception unit 104, the descriptor calculation unit 107, the material property estimation model learning unit 108, the material property estimation unit 109, the material selection unit 110, and the material property estimation result display unit 111.
  • the auxiliary storage device 154 further stores various data such as the material structural formula database 105, the experimental material database 106, and the selected structural formula database 112.
  • the program executed by the processor 151 and the data to be processed are loaded from the auxiliary storage device 154 into the DRAM 152.
  • the material property estimation device 100 includes an input device 155 that accepts an operation from the user and a monitor 156 (an example of the output device) that presents the output result in each process to the user.
  • the function of the material property estimation device 100 may be implemented separately in a plurality of devices.
  • the material property estimation device 100 includes one or more storage devices and one or more processors.
  • FIG. 3 shows a flowchart of an example of the overall processing of the material property estimation device 100.
  • the experiment data receiving unit 103 receives the material experiment data from the user 102 via the input device 155 and stores it in the experimented material database 106.
  • the material list receiving unit 104 receives the material list from the user 102 via the input device 155 and stores it in the material structural formula database 105.
  • FIG. 4 schematically shows an example of a graphical user interface (GUI) 201 for inputting material experiment data displayed on the monitor 156.
  • the user inputs necessary information from the input device 155 to the GUI 201.
  • The user can specify the file in which the experiment data is stored by using the "reference" button in the GUI 201, and can select the "OK" button to indicate that file to the experiment data receiving unit 103.
  • the experiment data receiving unit 103 stores the data of the designated file in the experimented material database 106.
  • FIG. 5 shows a configuration example of the experimental material database 106.
  • the experimented material database 106 correlates the material with the experimental result of the property value of the material.
  • the experimental material database 106 includes a number column 251, a structural formula (SMILES) column 252, and a material property measurement value column 253.
  • Structural formula (SMILES) column 252 represents the chemical structural formula of the material. In the example of FIG. 5, the chemical structural formula is expressed in SMILES (Simplified Molecular Input Line Entry System) notation. Any representation of the chemical structural formula from which a descriptor can be generated can be used.
  • The material property measurement value column 253 shows the experimental results of predetermined property values for each chemical structural formula. Some or all of the measured values stored in the experimental material database 106 (measurement database) may be simulation results.
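The three-column layout of the experimental material database (number, SMILES structural formula, measured property value) could be loaded from a file along the following lines. The CSV format, the header names, the `ExperimentRecord` type, and the sample values are assumptions made for illustration; the source does not specify a file format.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExperimentRecord:
    number: int            # corresponds to number column 251
    smiles: str            # corresponds to structural formula (SMILES) column 252
    measured_value: float  # corresponds to material property measurement value column 253


def load_experiment_records(text: str) -> List[ExperimentRecord]:
    """Parse CSV text with the headers number, smiles, measured_value."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        ExperimentRecord(int(row["number"]), row["smiles"], float(row["measured_value"]))
        for row in reader
    ]


# Illustrative content only; the measured values are made up.
SAMPLE = "number,smiles,measured_value\n1,CCO,0.42\n2,c1ccccc1,1.30\n"
records = load_experiment_records(SAMPLE)
print(records)
```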
  • FIG. 6 schematically shows an example of GUI 202 for inputting a material list for material property value estimation target displayed on the monitor 156.
  • the user inputs necessary information from the input device 155 to the GUI 202.
  • In the GUI 202, the user can specify the file storing the material list by using the "reference" button, and can select the "OK" button to indicate that file to the material list reception unit 104.
  • the material list reception unit 104 stores the data of the specified file in the material structural formula database 105.
  • FIG. 7 shows a configuration example of the material structural formula database 105.
  • the material structural formula database 105 stores the chemical structural formula for which the material property value is estimated.
  • low-dimensional descriptors of all materials stored in the material structural formula database 105 are generated, and the material property values thereof are estimated.
  • some materials (chemical structural formulas) stored in the material structural formula database 105 may be selected to generate low-dimensional descriptors. For example, a predetermined number of materials may be randomly selected.
  • the material structural formula database 105 includes a number column 261 and a structural formula (SMILES) column 262.
  • the number column 261 identifies each record in the material structural formula database 105.
  • Structural formula (SMILES) column 262 represents the SMILES representation of the chemical structural formula of the material.
  • In step S103, the material property estimation model learning unit 108 learns (trains) the low-dimensional material property estimation model and passes the learned model to the material property estimation unit 109.
  • FIG. 8 shows a detailed flowchart of the learning process (S103) of the low-dimensional material property estimation model.
  • The descriptor calculation unit 107 acquires some or all of the materials (chemical structural formulas) from the experimented material database 106 and calculates a low-dimensional descriptor for each of them.
  • The number of materials (amount of training data) acquired from the experimental material database 106 for training the low-dimensional material property estimation model is smaller than the number of materials (amount of training data) acquired for the high-dimensional material property estimation model described later. Since the number of dimensions of the low-dimensional material property estimation model is smaller than that of the high-dimensional material property estimation model, it can be trained efficiently and appropriately with less training data than the high-dimensional material property estimation model.
  • The data acquired from the experimental material database 106 consists of the values of the number column 251 and the structural formula column 252.
  • The types and number of descriptor elements constituting the low-dimensional descriptor are either preset in the device or selected from the elements of the high-dimensional descriptor based on their importance.
  • In step S202, the material property estimation model learning unit 108 receives the calculated low-dimensional descriptors from the descriptor calculation unit 107 and acquires, from the experimented material database 106, the material property values (material property measurement value list) of the chemical structural formulas corresponding to the calculated low-dimensional descriptors.
  • FIG. 9 shows a configuration example of a descriptor list passed by the descriptor calculation unit 107 to the material property estimation model learning unit 108.
  • FIG. 9 shows an example of the descriptor list 300 generated by the descriptor calculation unit 107 for the low-dimensional descriptor; the table structure of the high-dimensional descriptor list is the same except for the number of descriptor elements.
  • the descriptor list 300 includes a number column 301 and a column for each of the descriptor elements.
  • the value in number column 301 corresponds to the value in number column 251 in the experimental material database 106.
  • In the example of FIG. 9, the descriptor is composed of 4000 descriptor elements, and the columns of four of those descriptor elements are indicated by reference numerals 302 to 305 as an example.
  • FIG. 10 shows an example of the material property measurement value (material property measurement value list) 330 acquired by the material property estimation model learning unit 108 from the experimented material database 106.
  • the material property measurement value list includes a number column 331 and a material property measurement value column 332.
  • the value in number column 331 corresponds to the value in number column 251 in the experimental material database 106.
  • the value of the material property measurement value column 332 corresponds to the value of the material property measurement value column 253.
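Because the descriptor list and the material property measurement value list share the same material numbers, training pairs can be assembled by joining the two tables on that number. The sketch below assumes both lists are held as dictionaries keyed by number; the function name and sample values are hypothetical.

```python
from typing import Dict, List, Tuple


def build_training_pairs(
    descriptor_list: Dict[int, List[float]],
    measurement_list: Dict[int, float],
) -> Tuple[List[List[float]], List[float]]:
    """Join descriptors and measured values on the shared material number."""
    features: List[List[float]] = []
    targets: List[float] = []
    for number in sorted(descriptor_list):
        if number in measurement_list:  # skip materials that have no measurement
            features.append(descriptor_list[number])
            targets.append(measurement_list[number])
    return features, targets


# Hypothetical tables; material 3 has a descriptor but no measured value.
descriptor_list = {1: [46.07, 0.22], 2: [78.11, 0.5], 3: [45.08, 0.2]}
measurement_list = {1: 0.42, 2: 1.30}
features, targets = build_training_pairs(descriptor_list, measurement_list)
print(features, targets)
```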
  • the material property estimation model learning unit 108 learns the low-dimensional material property estimation model from the acquired low-dimensional descriptor and the material property value.
  • the material property estimation model learning unit 108 holds information on the initial configuration of the low-dimensional material property estimation model in advance, and configures the low-dimensional material property estimation model according to the information.
  • any kind of machine learning model can be used as the low-dimensional material property estimation model.
  • the material property estimation model learning unit 108 sequentially inputs the low-dimensional descriptor into the low-dimensional material property estimation model, and acquires the output material property value estimation value.
  • The material property estimation model learning unit 108 optimizes the low-dimensional material property estimation model by updating its parameters based on the error between the estimated material property value and the acquired material property measurement value.
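The loop of feeding descriptors to the model one at a time and updating parameters from the error against the measured value can be illustrated with a linear model trained by stochastic gradient descent. This is a stand-in chosen for brevity: the source allows any machine learning model, and the learning rate, epoch count, and toy data here are assumptions.

```python
from typing import List, Tuple


def train_linear_model(
    descriptors: List[List[float]],
    measurements: List[float],
    learning_rate: float = 0.05,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit estimate = bias + sum(w_i * x_i) by per-sample gradient updates."""
    weights = [0.0] * len(descriptors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, measured in zip(descriptors, measurements):
            estimate = bias + sum(w * xi for w, xi in zip(weights, x))
            error = estimate - measured  # estimation error for this material
            # Move each parameter against the gradient of the squared error.
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


# Toy data generated from measured = 2*x0 - 1*x1 + 0.5 (no noise).
descriptors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
measurements = [0.5, 2.5, -0.5, 1.5, 3.5]
weights, bias = train_linear_model(descriptors, measurements)
print(weights, bias)
```

On this noise-free toy data the parameters approach the generating values 2, -1, and 0.5.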
  • the material property estimation model learning unit 108 passes the learned low-dimensional material property estimation model to the material property estimation unit 109.
  • In step S104, in response to an instruction from the material property estimation unit 109, the descriptor calculation unit 107 acquires some or all of the chemical structural formulas (records) from the material structural formula database 105 and calculates a low-dimensional descriptor for each.
  • the correspondence between the numbers and the chemical structural formulas is the same as that of the material structural formula database 105 shown in FIG.
  • the material property estimation unit 109 receives the low-dimensional descriptor calculated from the descriptor calculation unit 107 and estimates the material property. Specifically, the material property estimation unit 109 inputs each of the acquired low-dimensional descriptors into the trained low-dimensional material property estimation model, and acquires the corresponding characteristic estimation value.
  • the material selection unit 110 receives the material property estimation result from the material property estimation unit 109, and further acquires the chemical structural formula of the number indicated by the received estimation result from the material structural formula database 105.
  • the material selection unit 110 selects a material (chemical structural formula) based on the material property estimation result, and stores the selected chemical structural formula in the selected structural formula database 112.
  • the data structure of the selected structural formula database 112 is the same as that of the material structural formula database 105, and the number associated with the chemical structural formula may be the same.
  • FIG. 11 shows a configuration example of the material property estimation result (material property estimation value list) 340 passed from the material property estimation unit 109 to the material selection unit 110.
  • The material property estimation value list 340 includes a number column 341 and a material property estimation value column 342.
  • The values of the number column 341 correspond to the values of the number column 261 of the material structural formula database 105.
  • The material property estimation value column 342 shows the material property estimation value of the chemical structural formula of each number indicated by the number column 341.
  • The material selection unit 110 refers to the material property estimation value list 340, selects the materials whose property estimation values satisfy preset conditions, and stores the chemical structural formulas of the selected materials in the selected structural formula database 112.
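  • The selection step can be illustrated with a short sketch. The description does not specify the preset condition, so the threshold used here is an assumption, as are the function name `select_materials`, the use of plain dictionaries in place of the databases 105 and 112, and the SMILES strings standing in for chemical structural formulas.

```python
from typing import Callable, Dict, List, Tuple


def select_materials(
    estimates: List[Tuple[int, float]],
    structure_db: Dict[int, str],
    condition: Callable[[float], bool],
) -> Dict[int, str]:
    """Return {number: structural formula} for estimates that satisfy condition."""
    return {
        number: structure_db[number]
        for number, value in estimates
        if condition(value)
    }


# Stand-in for the material structural formula database (number -> formula).
structure_db = {1: "CCO", 2: "c1ccccc1", 3: "CC(=O)O", 4: "CCN"}
# Stand-in for the estimation value list (number, estimated property value).
estimates = [(1, 0.42), (2, 0.91), (3, 0.77), (4, 0.15)]

# Hypothetical preset condition: keep materials estimated at 0.7 or higher.
selected = select_materials(estimates, structure_db, lambda v: v >= 0.7)
```

  • Because the selected records keep their original numbers, the result has the same structure as the source database, which matches the statement that the two databases share a data structure.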
  • In step S107, the material property estimation model learning unit 108 learns (trains) the high-dimensional material property estimation model and passes the trained model to the material property estimation unit 109.
  • FIG. 12 shows a detailed flowchart of the learning process (S107) of the high-dimensional material property estimation model.
  • The descriptor calculation unit 107 acquires some or all of the materials (chemical structural formulas) from the experimented material database 106 and calculates the high-dimensional descriptor of each.
  • The data acquired from the experimented material database 106 consist of the values of the number column 251 and the structural formula column 252 of that database.
  • The types and number of descriptor elements that make up the high-dimensional descriptor are preset.
  • The material property estimation model learning unit 108 receives the calculated high-dimensional descriptors (descriptor list) from the descriptor calculation unit 107.
  • The structure of the descriptor list passed by the descriptor calculation unit 107 to the material property estimation model learning unit 108 is the same as that of the descriptor list 300 shown in FIG. 9, and the records of the two lists may be the same or different.
  • The material property estimation model learning unit 108 acquires, from the experimented material database 106, the material property values (material property measurement value list) of the chemical structural formulas corresponding to the calculated high-dimensional descriptors.
  • The structure of the material property measurement value list is the same as that of the material property measurement value list 330 shown in FIG.
  • The numbers (corresponding materials) in the material property measurement value list match the numbers (corresponding materials) in the high-dimensional descriptor list.
  • The material property estimation model learning unit 108 trains the high-dimensional material property estimation model using the acquired high-dimensional descriptors and material property values.
  • The material property estimation model learning unit 108 holds information on the initial configuration of the high-dimensional material property estimation model in advance, and configures the model according to that information.
  • Any kind of machine learning model can be used as the high-dimensional material property estimation model.
  • The material property estimation model learning unit 108 sequentially inputs the high-dimensional descriptors into the high-dimensional material property estimation model and acquires the estimated material property values that the model outputs.
  • The material property estimation model learning unit 108 optimizes the high-dimensional material property estimation model by updating its parameters based on the error between each estimated material property value and the corresponding acquired material property measurement value.
  • The material property estimation model learning unit 108 passes the trained high-dimensional material property estimation model to the material property estimation unit 109.
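  • Training relies on the numbers in the high-dimensional descriptor list matching the numbers in the material property measurement value list. A minimal sketch of pairing the two lists by number before training is shown below; the function name `align_by_number`, the dictionary representation, and the sample values are assumptions made for illustration and are not taken from the description.

```python
from typing import Dict, List, Sequence, Tuple


def align_by_number(
    descriptor_list: Dict[int, Sequence[float]],
    measurement_list: Dict[int, float],
) -> Tuple[List[int], List[Sequence[float]], List[float]]:
    """Pair each descriptor with the measurement that has the same number."""
    if set(descriptor_list) != set(measurement_list):
        # The two lists are expected to cover exactly the same materials.
        unmatched = set(descriptor_list) ^ set(measurement_list)
        raise ValueError(f"numbers present in only one list: {sorted(unmatched)}")
    numbers = sorted(descriptor_list)
    features = [descriptor_list[n] for n in numbers]
    targets = [measurement_list[n] for n in numbers]
    return numbers, features, targets


# Hypothetical records keyed by number; the record order need not agree.
descriptor_list = {3: [0.2, 1.5, 7.0], 1: [0.9, 0.1, 4.0], 2: [0.4, 2.2, 5.5]}
measurement_list = {1: 12.3, 2: 8.7, 3: 15.1}
numbers, features, targets = align_by_number(descriptor_list, measurement_list)
```

  • Raising an error on a mismatch, rather than silently dropping records, is a design choice of this sketch; it makes a broken correspondence between the two lists visible before any parameters are updated.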
  • In step S108, in response to an instruction from the material property estimation unit 109, the descriptor calculation unit 107 acquires the chemical structural formulas from the selected structural formula database 112 and calculates their high-dimensional descriptors. Calculating high-dimensional descriptors only for the materials selected on the basis of the property estimation values obtained with the low-dimensional descriptors speeds up both the calculation of the high-dimensional descriptors and the subsequent calculation of the property estimation values.
  • The material property estimation unit 109 receives the high-dimensional descriptors of the chemical structural formulas stored in the selected structural formula database 112 from the descriptor calculation unit 107 and estimates the material properties. Specifically, the material property estimation unit 109 sequentially inputs the acquired high-dimensional descriptors into the trained high-dimensional material property estimation model, which outputs the material property estimation value for each input high-dimensional descriptor.
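  • The two-stage flow, in which the cheap low-dimensional estimate screens all candidates and the costly high-dimensional descriptor is calculated only for the selected ones, can be sketched as follows. Everything named here (`two_stage_screen`, the toy descriptor functions, the toy models, and the threshold) is a hypothetical stand-in chosen so that the example runs on its own; none of it is the descriptor set or model of the embodiment.

```python
from typing import Callable, Dict, List, Sequence

Descriptor = Sequence[float]


def two_stage_screen(
    candidates: Dict[int, str],
    low_descriptor: Callable[[str], Descriptor],
    low_model: Callable[[Descriptor], float],
    keep: Callable[[float], bool],
    high_descriptor: Callable[[str], Descriptor],
    high_model: Callable[[Descriptor], float],
) -> Dict[int, float]:
    """Screen with the low-dimensional model, then refine the survivors."""
    # Stage 1: low-dimensional estimate for every candidate.
    shortlisted = {
        number: formula
        for number, formula in candidates.items()
        if keep(low_model(low_descriptor(formula)))
    }
    # Stage 2: high-dimensional descriptor only for the shortlisted materials.
    return {
        number: high_model(high_descriptor(formula))
        for number, formula in shortlisted.items()
    }


high_calls: List[str] = []  # records each high-dimensional descriptor calculation


def toy_low_descriptor(formula: str) -> Descriptor:
    return [float(len(formula))]


def toy_high_descriptor(formula: str) -> Descriptor:
    high_calls.append(formula)
    return [float(formula.count(symbol)) for symbol in "CNO="]


candidates = {1: "CCO", 2: "CC(=O)O", 3: "CCN", 4: "CC(=O)NC"}
result = two_stage_screen(
    candidates,
    toy_low_descriptor,
    lambda d: d[0],       # toy low-dimensional model
    lambda v: v >= 5.0,   # hypothetical preset condition
    toy_high_descriptor,
    lambda d: sum(d),     # toy high-dimensional model
)
```

  • In this toy run `high_calls` ends up with one entry per shortlisted material rather than one per candidate, which is the source of the speed-up described above.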
  • The material property estimation result display unit 111 receives the material property estimation results for the selected chemical structural formulas from the material property estimation unit 109.
  • The material property estimation result display unit 111 further acquires the chemical structural formulas from the selected structural formula database 112.
  • The material property estimation result display unit 111 presents the acquired material property estimation results and the chemical structural formulas to the user.
  • FIG. 13 shows an example of the image of the material property estimation results displayed on the monitor 156 by the material property estimation result display unit 111.
  • The image shows the chemical structural formulas of the selected materials and their corresponding estimated material property values.
  • By referring to the displayed chemical structural formulas and material property values, the user can decide which chemical structural formulas to actually subject to experiment or simulation.
  • The save button saves the estimation result.
  • The present invention is not limited to the above-described embodiment, and includes various modifications.
  • The above-described embodiment has been described in detail in order to explain the present invention in an easy-to-understand manner, and the invention is not necessarily limited to an embodiment that includes all of the described configurations.
  • It is possible to replace a part of the configuration of one embodiment with the configuration of another embodiment, and it is also possible to add the configuration of another embodiment to the configuration of one embodiment.
  • Each of the above-mentioned configurations, functions, processing units, and the like may be realized in hardware, for example by designing a part or all of them as an integrated circuit.
  • Each of the above configurations, functions, and the like may also be realized in software, by a processor interpreting and executing a program that realizes each function.
  • Information such as programs, tables, and files that realize each function can be placed in a memory, in a recording device such as a hard disk or an SSD (Solid State Drive), or on a recording medium such as an IC card or an SD card.
  • The control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines of a product are necessarily shown. In practice, almost all configurations can be considered to be interconnected.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
PCT/JP2021/015044 2020-04-28 2021-04-09 System for estimating material property values Ceased WO2021220776A1 (ja)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21797897.2A EP4145328A4 (en) 2020-04-28 2021-04-09 System that estimates characteristic value of material
US17/920,052 US20230081583A1 (en) 2020-04-28 2021-04-09 System for predicting material property value

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020-079793 2020-04-28
JP2020079793A JP7339924B2 (ja) 2020-04-28 2020-04-28 System for estimating material property values

Publications (1)

Publication Number Publication Date
WO2021220776A1 true WO2021220776A1 (ja) 2021-11-04

Family

ID=78279919

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/015044 Ceased WO2021220776A1 (ja) 2020-04-28 2021-04-09 System for estimating material property values

Country Status (4)

Country Link
US (1) US20230081583A1 (en)
EP (1) EP4145328A4 (en)
JP (1) JP7339924B2 (en)
WO (1) WO2021220776A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024252858A1 (ja) * 2023-06-07 2024-12-12 Sony Group Corporation Control device, control method, and non-transitory storage medium
US12368503B2 (en) 2023-12-27 2025-07-22 Quantum Generative Materials Llc Intent-based satellite transmit management based on preexisting historical location and machine learning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018098588A1 (en) * 2016-12-02 2018-06-07 Lumiant Corporation Computer systems for and methods of identifying non-elemental materials based on atomistic properties
WO2019060268A1 (en) * 2017-09-19 2019-03-28 Covestro Llc CUSTOM DESIGN TECHNIQUES FOR PRODUCTS
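WO2020031671A1 (ja) * 2018-08-08 2020-02-13 Panasonic Intellectual Property Management Co., Ltd. Material descriptor generation method, material descriptor generation device, material descriptor generation program, prediction model construction method, prediction model construction device, and prediction model construction program
WO2020075573A1 (ja) * 2018-10-10 2020-04-16 National Institute for Materials Science Prediction management system, prediction management method, data structure, prediction management device, and prediction execution device
JP2020079793A (ja) 2014-06-30 2020-05-28 Biodesy, Inc. Systems and methods for high-throughput analysis of conformation in biological entities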

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
L.M. GHIRINGHELLI ET AL.: "Big Data of Materials Science: Critical Role of the Descriptor", PHYS. REV. LETT., vol. 114, 2015, pages 105503
MORIKAWA KOJI: "Application of materials informatics to inorganic compounds", JOURNAL OF THE JAPANESE SOCIETY FOR ARTIFICIAL INTELLIGENCE, vol. 34, no. 3, 1 May 2019 (2019-05-01), pages 364 - 369, XP055868753 *
R. OUYANG ET AL.: "SISSO: A compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates", PHYS. REV. MATERIALS, vol. 2, 2018, pages 083802
See also references of EP4145328A4

Also Published As

Publication number Publication date
JP2021174403A (ja) 2021-11-01
JP7339924B2 (ja) 2023-09-06
EP4145328A4 (en) 2024-07-10
US20230081583A1 (en) 2023-03-16
EP4145328A1 (en) 2023-03-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21797897; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2021797897; Country of ref document: EP; Effective date: 20221128)