JP7122699B2

JP7122699B2 - Material information output method, material information output device, material information output system, and program

Info

Publication number: JP7122699B2
Application number: JP2018156142A
Authority: JP
Inventors: 好秀澤田; 幸治森川; 透中田
Original assignee: Panasonic Intellectual Property Management Co Ltd
Current assignee: Panasonic Intellectual Property Management Co Ltd
Priority date: 2018-08-23
Filing date: 2018-08-23
Publication date: 2022-08-22
Anticipated expiration: 2038-08-23
Also published as: JP2020030638A

Description

The present disclosure relates to technology for outputting information about materials.

In recent years, there has been active research on using artificial intelligence technology to present candidate new materials to materials researchers. For example, Patent Literature 1 describes a technique for determining whether or not a pair consisting of a query protein and an arbitrary compound can be synthesized.

Patent Literature 1: Japanese Patent No. 5946045

However, with Patent Literature 1, a composition formula of a pair that is actually synthesizable may nevertheless be determined to be unsynthesizable, so further improvement is needed.

An object of the present disclosure is to provide a technique that prevents an unknown composition formula that is actually synthesizable from being determined to be unsynthesizable.

A material information output method according to one aspect of the present disclosure is a material information output method in a material information output device that outputs information about materials, wherein
the material information output device includes a memory that stores in advance pseudo composition formulas, generated during the learning process by a generative model of pseudo composition formulas that is itself generated by learning a known composition formula list, in association with numerical values indicating the learning stage, and
a processor of the material information output device:
calculates the synthesizability of each pseudo composition formula stored in the memory using the corresponding numerical value, and assigns the calculated synthesizability to that pseudo composition formula as a teacher label;
generates an estimation model for estimating the synthesizability by learning the pseudo composition formulas to which the teacher labels have been assigned;
calculates the synthesizability of an arbitrary composition formula using the trained estimation model; and
outputs the calculated synthesizability.

According to the present disclosure, an unknown composition formula that is actually synthesizable can be prevented from being determined to be unsynthesizable.

FIG. 1 is a block diagram showing an example of the configuration of the material information output system according to Embodiment 1.
FIG. 2 is a diagram showing an example of the data structure of the known composition formula list.
FIG. 3 is a diagram showing an example of feature quantities calculated according to the periodic table.
FIG. 4 is a conceptual diagram for explaining the computational model of the operations performed by a neural network.
FIG. 5 is a diagram explaining how a neural network learns a classification problem.
FIG. 6 is a diagram explaining how a neural network learns a regression problem.
FIG. 7 is a diagram explaining how a neural network learns a generation problem.
FIG. 8 is a diagram showing an example of the detailed configuration of the generative model learning unit shown in FIG. 1.
FIG. 9 is a diagram explaining how the neural network constituting the identification unit learns.
FIG. 10 is a flowchart showing an example of processing in the learning process of the material information output system shown in FIG. 1.
FIG. 11 is a flowchart showing an example of processing in the utilization process of the material information output system shown in FIG. 1.
FIG. 12 is a diagram showing an example of an image for presenting the synthesizability to the user.
FIG. 13 is a block diagram showing an example of the configuration of a material information output system according to a modification of Embodiment 1.
FIG. 14 is a block diagram showing an example of the configuration of a material information output system according to Embodiment 2 of the present disclosure.
FIG. 15 is a flowchart showing an example of processing in the learning process of the material information output system shown in FIG. 14.
FIG. 16 is a flowchart showing details of the voting processing.
FIG. 17 is a diagram explaining the groups of elements.
FIG. 18 is a diagram showing an example of the data structure of the known composition formula list according to Embodiment 2.
FIG. 19 is a block diagram showing an example of the configuration of a material information output system according to Embodiment 3 of the present disclosure.
FIG. 20 is a block diagram showing an example of the configuration of a material information output system according to Embodiment 4 of the present disclosure.
FIG. 21 is a diagram showing an example of a scatter diagram displayed by the display unit.
FIG. 22 is a flowchart showing an example of processing in the utilization process of the material information output system according to Embodiment 4.
FIG. 23 is a block diagram showing an example of a hardware configuration for realizing the material information output system.
FIG. 24 is a diagram showing another example of the hardware configuration of the material information output system of the present disclosure.

(Background leading to the present disclosure)
In recent years, the field of materials informatics has been attracting attention. Materials informatics refers to artificial intelligence (AI) technology, typified by machine learning, used to accelerate materials research. In materials informatics, a model built with machine learning techniques is used to present candidate new materials to materials researchers, who then create (synthesize) the desired material from among the presented candidates. However, actually synthesizing a material takes an enormous amount of time, so creating the desired material from among the presented candidates is inefficient, in part because rework may be needed. In materials informatics it is therefore important to estimate the synthesizability of each candidate new material and to present it to the materials researcher together with the candidate.

In Patent Literature 1 described above, first pairs, each consisting of a protein and a compound that are known to interact, are used as positive teacher data, and second pairs, each obtained by randomly combining a protein and a compound, are used as negative teacher data, and an interaction learning model is generated using a machine learning technique such as a support vector machine. Patent Literature 1 then discloses a technique that refers to this interaction learning model to calculate a score indicating the likelihood of interaction for each of a plurality of pairs obtained by randomly combining the query protein with compounds.

When pairs obtained by randomly combining proteins and compounds are created as composition formulas, there are many cases in which a composition formula is not yet known but is in fact synthesizable.

In Patent Literature 1, however, such composition formulas, which should really be treated as positive teacher data, are treated as negative teacher data when the interaction learning model is trained. As a result, the trained interaction learning model assigns a low score to an unknown composition formula that is actually synthesizable, and that composition formula is determined to be unsynthesizable.

The present disclosure provides a technique that can prevent an unknown composition formula that is actually synthesizable from being determined to be unsynthesizable.

The pseudo composition formulas that the generative model produces during its learning process become more likely to be synthesizable as the learning stage advances. This configuration can therefore assign a teacher label with a higher value to a pseudo composition formula that the generative model produced later in the learning process, which suppresses mislabeling of pseudo composition formulas. The estimation model for estimating synthesizability is generated by learning the pseudo composition formulas labeled in this way. Consequently, for an arbitrary composition formula, the estimation model outputs a higher synthesizability value the more synthesizable the formula actually is. This configuration can thus prevent an unknown composition formula that is actually synthesizable from being determined to be unsynthesizable.

In the above configuration, the method may further include:
identifying, among the known composition formulas included in the known composition formula list, first known composition formulas of the same type as the material to be estimated;
for each predetermined combination of element groups, casting a vote for each first known composition formula whose combination matches, and holding the voting result in the memory; and
assigning, to each pseudo composition formula whose combination matches a combination that received no votes, the teacher label indicating that the synthesizability is 0.

According to this configuration, among the pseudo composition formulas stored in the memory, a pseudo composition formula whose combination of element groups differs from those of the first known composition formulas, which are of the same type as the material to be estimated, is given a teacher label indicating that the synthesizability is 0. The estimation model can therefore be trained with a teacher label of 0 assigned to pseudo composition formulas whose combinations of element groups have little relevance to the material to be estimated. As a result, an estimation model with high estimation accuracy can be generated for the material to be estimated.
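The voting step can be sketched as follows. This is only an illustration under stated assumptions, not the implementation described in the disclosure: the `GROUP` table covers a handful of elements, formulas are encoded as lists of element symbols, and the function names `group_combination`, `vote`, and `zero_label_unvoted` are hypothetical.

```python
from collections import Counter

# Periodic-table group numbers for a few elements (illustrative subset only).
GROUP = {"Li": 1, "Na": 1, "Ca": 2, "Ti": 4, "Mn": 7, "Co": 9, "P": 15, "O": 16}


def group_combination(elements):
    """Return the combination of element groups as a hashable key."""
    return frozenset(GROUP[e] for e in elements)


def vote(first_known_formulas):
    """Count, per group combination, how many first known formulas match it."""
    return Counter(group_combination(f) for f in first_known_formulas)


def zero_label_unvoted(pseudo_formulas, labels, votes):
    """Set the teacher label to 0 where the group combination received no vote."""
    return [
        0.0 if votes[group_combination(f)] == 0 else label
        for f, label in zip(pseudo_formulas, labels)
    ]


# Formulas are given as lists of element symbols (assumed encoding).
known = [["Li", "Co", "O"], ["Li", "Ti", "O"]]
votes = vote(known)
pseudo = [["Na", "Co", "O"], ["Ca", "Mn", "O"]]
relabelled = zero_label_unvoted(pseudo, [0.9, 0.8], votes)
print(relabelled)
```

Here the Na-Co-O formula shares the group combination {1, 9, 16} with the known Li-Co-O formula and keeps its label, while the Ca-Mn-O formula matches no voted combination and is relabelled 0.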

In the above configuration, the known composition formula list may further include the formation energies of the known composition formulas, and the method may further include:
acquiring from the memory an energy estimation model trained using the known composition formula list, together with the mean and variance of the formation energies included in the known composition formula list; and
estimating the formation energy of each pseudo composition formula using the energy estimation model, calculating, using the mean and variance, the distance of the formation energy of the pseudo composition formula from the formation energies of the known composition formulas, and correcting the teacher label so that its value decreases as the distance increases.

A composition formula becomes unstable if its formation energy is too high, and it also becomes unstable if its formation energy is too low. According to this configuration, the formation energy of each labeled pseudo composition formula is estimated using the energy estimation model, and the teacher label is corrected to a smaller value as the distance between the estimated formation energy and the formation energies of the known composition formulas increases. Because the estimation model is trained using the pseudo composition formulas with corrected teacher labels, the training takes the physical property of formation energy into account, which improves the estimation accuracy of the estimation model.
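One possible form of this correction is sketched below. The text states only that the distance is computed from the mean and variance and that the label shrinks as the distance grows; the standardized distance and the exponential decay used here are assumptions chosen for illustration, as are the function names and the example statistics.

```python
import math


def energy_distance(energy, mean, variance):
    """Standardized distance of a formation energy from the known-formula mean."""
    return abs(energy - mean) / math.sqrt(variance)


def correct_label(label, energy, mean, variance):
    """Shrink the teacher label as the distance grows (exponential decay assumed)."""
    return label * math.exp(-energy_distance(energy, mean, variance))


# Made-up statistics of the known formation energies, for illustration only.
mean, variance = -2.0, 0.25
at_mean = correct_label(0.8, -2.0, mean, variance)   # distance 0: label unchanged
far_away = correct_label(0.8, -3.0, mean, variance)  # two standard deviations away
print(at_mean, far_away)
```

Any monotonically decreasing function of the distance would satisfy the stated requirement; the exponential is used only because it keeps the corrected label within the original range.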

In the above configuration, the memory may further hold a characteristic value estimation model that outputs a characteristic value for a pseudo composition formula, and the method may further include:
acquiring a characteristic value range input by the user;
inputting characteristic values within the characteristic value range into the generative model to generate pseudo composition formulas;
estimating the characteristic values of the generated pseudo composition formulas using the characteristic value estimation model; and
displaying on a display a scatter diagram in which the synthesizability and the estimated characteristic value of each generated pseudo composition formula are plotted on a graph having at least two axes, a first axis indicating the synthesizability and a second axis indicating the characteristic value.

According to this configuration, when the user inputs a characteristic value range, the generative model generates pseudo composition formulas having characteristic values within that range, and the characteristic value estimation model estimates the characteristic values of those pseudo composition formulas. A scatter diagram in which the synthesizability and the estimated characteristic value of each pseudo composition formula are plotted on a graph is then displayed on the display. By inputting a characteristic value range, the user can therefore easily grasp the relationship between synthesizability and characteristic value for a large number of pseudo composition formulas that are estimated to have characteristic values within that range.

In the above configuration, the known composition formula list may consist of the ICSD, Materials Project, or NOMAD database.

According to this configuration, an existing database such as ICSD, the Materials Project, or NOMAD is used as the known composition formula list, which saves the effort of creating the list.

In the above configuration, the memory may further hold an identification model that identifies whether a pseudo composition formula generated by the generative model is genuine or fake, and
the generative model and the identification model may be constituted by neural networks trained based on an adversarial learning algorithm.

According to this configuration, because the identification model and the generative model are constituted by neural networks trained based on an adversarial learning algorithm, the generative model can generate pseudo composition formulas with higher synthesizability.

In the above configuration, in assigning the teacher labels, teacher labels may be assigned only to the pseudo composition formulas generated at the beginning of the learning process and to those generated at the end of learning, and held in the memory;
the pseudo composition formulas generated at the beginning of the learning stage may be given a numerical value indicating that the synthesizability is 0; and
the pseudo composition formulas generated at the end of learning may be given a numerical value indicating that the synthesizability is 1.

According to this configuration, teacher labels are assigned only to the pseudo composition formulas generated at the beginning of the learning process and to those generated at the end of learning, and only these are held in the memory, so free memory capacity can be secured. In addition, because only the values 0 and 1 are used as teacher labels, the data structure is simplified, which also helps secure free memory capacity.
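A minimal sketch of this binary labeling is shown below. The record layout (a list of formula and iteration pairs) and the `initial_cutoff` parameter that decides what counts as "the beginning" are assumptions made for the example; the text does not specify them.

```python
def binary_labels(records, total_iterations, initial_cutoff):
    """Keep only early and final pseudo formulas, labelled 0 and 1.

    records: list of (pseudo_formula, iteration) pairs.
    initial_cutoff: iterations up to this value count as "initial" (assumed).
    """
    labelled = []
    for formula, iteration in records:
        if iteration <= initial_cutoff:
            labelled.append((formula, 0))
        elif iteration == total_iterations:
            labelled.append((formula, 1))
        # Formulas from intermediate iterations are dropped to save memory.
    return labelled


records = [("A", 1), ("B", 50), ("C", 100)]
kept = binary_labels(records, total_iterations=100, initial_cutoff=1)
print(kept)
```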

In the above configuration, the ratio of the numerical value indicating the learning stage to the total number of learning iterations of the generative model may be assigned as the teacher label.

According to this configuration, the teacher label can be expressed as a numerical value in the range of 0 to 1.
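The ratio-based label amounts to a single division, as the following sketch shows. The function name and the (formula, iteration) record layout are hypothetical, and the placeholder strings stand in for pseudo composition formulas.

```python
def ratio_label(iteration, total_iterations):
    """Teacher label: learning stage divided by the total number of iterations."""
    return iteration / total_iterations


# Pseudo formulas stored with the iteration at which they were generated.
stored = [("formula_early", 250), ("formula_late", 1000)]
labelled = [(f, ratio_label(i, 1000)) for f, i in stored]
print(labelled)
```

A formula generated a quarter of the way through training receives 0.25, and one generated at the final iteration receives 1.0.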

(Embodiment 1)
The material information output system according to Embodiment 1 trains a generative model, constituted by a neural network, on the known composition formulas included in the known composition formula list; calculates, according to a fixed criterion, the synthesizability of each pseudo composition formula that the generative model outputs during training; and assigns the calculated synthesizability as a teacher label. It then trains an estimation model, also constituted by a neural network, using pairs of teacher labels and pseudo composition formulas. Finally, an unknown composition formula is input to the trained estimation model, which outputs the synthesizability of that formula.

FIG. 1 is a block diagram showing an example of the configuration of the material information output system according to Embodiment 1. The material information output system includes a material information output device 100, a known composition formula list 101, a generative model learning unit 102, an input unit 103, and a descriptor calculation unit 108. The material information output device 100 includes a pseudo composition formula holding unit 104, a teacher label assigning unit 105, an estimation model learning unit 106, an estimation model holding unit 107, a synthesizability calculation unit 109, and a display unit 110. Each block of the material information output system may be configured as software realized by a processor, such as an image processor or a microprocessor, executing a predetermined program. Specifically, in FIG. 1, blocks drawn as rectangles are constituted by a processor unless otherwise noted, and blocks drawn as cylinders are constituted by memory such as semiconductor memory.

The known composition formula list 101 consists of an existing database (DB) that stores known composition formulas, such as ICSD, the Materials Project, or NOMAD. A known composition formula is a composition formula whose existence has been confirmed by materials researchers. The known composition formula list 101 may also include composition formulas whose existence the materials researcher has confirmed through his or her own experiments.

FIG. 2 is a diagram showing an example of the data structure of the known composition formula list 101. For each of a plurality of known composition formulas, the known composition formula list 101 stores a "composition formula", "characteristic values", a "feature quantity", and a "descriptor" in association with one another. The "composition formula" field holds a composition formula such as Li_0-1TiO_2. A "characteristic value" is a parameter that indicates a property of the known composition formula, such as voltage or battery capacity. In this example, battery-related parameters such as voltage and battery capacity are used as characteristic values, but this is only an example, and parameters relating to materials other than batteries may be used.

The "descriptor" consists of parameters representing the crystal structure of the material, denoted (a, b, c, α, β, γ). As parameters indicating the crystal structure, a group of numerical values specific to each element can be used, such as the coordinates of each atom, the atomic volume, the covalent radius, and the density. Taking CaMnO_3 as an example, the value 100.8 may also be used as such a parameter; it is the average of the atomic radii of Ca (197), Mn (127), and O (60), weighted by the ratio Ca:Mn:O = 1:1:3.
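The CaMnO_3 value can be checked with a short calculation. The helper name `weighted_average` and the dictionary layout are illustrative choices; the radii and the 1:1:3 ratio are the ones given in the text.

```python
def weighted_average(values, weights):
    """Average of per-element values weighted by the element ratio."""
    total_weight = sum(weights[e] for e in values)
    return sum(values[e] * weights[e] for e in values) / total_weight


radius = {"Ca": 197, "Mn": 127, "O": 60}   # atomic radii from the text
ratio = {"Ca": 1, "Mn": 1, "O": 3}         # Ca:Mn:O = 1:1:3
average_radius = weighted_average(radius, ratio)
print(average_radius)  # (197 + 127 + 3 * 60) / 5 = 100.8
```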

The "feature quantity" is expressed, for example, as a group of numerical values (a vector) calculated from the proportions of elements in the periodic table, as shown in FIG. 3. FIG. 3 is a diagram showing an example of a feature quantity calculated according to the periodic table. The calculation of the feature quantity 304 is described below using the composition formula 301, LiCo(PO_3)_4, as an example. First, the composition formula 301 is converted into a one-dimensional vector 303 by raster-scanning the elements of the periodic table 302 from the upper left to the lower right. The vector 303 consists of a plurality of vector elements, each associated with an element in periodic-table order. The vector 303 is created by assigning the number of atoms of each element in the composition formula 301 to the corresponding vector element. LiCo(PO_3)_4 consists of the elements Li, Co, O, and P, with counts of 1, 1, 12, and 4 respectively, so the vector 303 is expressed as (0, 0, 1, ..., 12, ..., 4, ..., 1, 0, ...). Next, the vector 303 is normalized so that its elements sum to 1, which yields the feature quantity 304.
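The conversion from element counts to a normalized feature vector can be sketched as follows. Two simplifications are assumed here: the element list is truncated to the first 36 elements, for which a raster scan of the periodic table coincides with atomic-number order, and the composition is passed in as a dictionary of element counts rather than parsed from a formula string. The function name `feature_vector` is hypothetical.

```python
# First 36 elements in atomic-number order (assumed truncation for brevity).
ELEMENTS = [
    "H", "He", "Li", "Be", "B", "C", "N", "O", "F", "Ne",
    "Na", "Mg", "Al", "Si", "P", "S", "Cl", "Ar", "K", "Ca",
    "Sc", "Ti", "V", "Cr", "Mn", "Fe", "Co", "Ni", "Cu", "Zn",
    "Ga", "Ge", "As", "Se", "Br", "Kr",
]


def feature_vector(counts):
    """Place element counts in periodic-table order and normalize to sum 1."""
    vector = [0.0] * len(ELEMENTS)
    for element, count in counts.items():
        vector[ELEMENTS.index(element)] = float(count)
    total = sum(vector)
    return [x / total for x in vector]


# LiCo(PO_3)_4 contains 1 Li, 1 Co, 4 P, and 12 O atoms.
features = feature_vector({"Li": 1, "Co": 1, "P": 4, "O": 12})
print(features[2], features[7], features[14], features[26])
```

The four printed entries are the normalized shares of Li, O, P, and Co (1/18, 12/18, 4/18, and 1/18); every other entry is 0.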

Because the feature quantity and the descriptor are values that can easily be converted back into an actual composition formula, they correspond one-to-one with the composition formula. Accordingly, if at least one of the feature quantity and the descriptor is known, the composition formula can be uniquely identified. In the following description, generating a pseudo composition formula is therefore treated as equivalent to generating at least one of a feature quantity and a descriptor.

Returning to FIG. 1, the generative model learning unit 102 trains a generative model for generating composition formulas using the known composition formulas included in the known composition formula list 101. The composition formulas output by the generative model include formulas that are not in the known composition formula list 101. In the present embodiment, the composition formulas generated incidentally during training are called pseudo composition formulas, and they are used to build an estimation model for estimating the synthesizability of materials.

The neural network constituting the generative model learning unit 102 is described below. FIG. 4 is a conceptual diagram for explaining the computational model of the operations performed by a neural network. As is well known, a neural network is a computing device that performs computation according to a computational model imitating a biological neural network.

As shown in FIG. 4, the neural network 400 is configured by arranging a plurality of units corresponding to neurons (shown as white circles) in each of an input layer 401, a hidden layer 402, and an output layer 403.

In the example shown in FIG. 4, the hidden layer 402 consists of two hidden layers, 402a and 402b, but it may consist of a single hidden layer or of three or more hidden layers. A neural network having a plurality of hidden layers is sometimes specifically called a multilayer neural network device.

Here, a layer closer to the input layer 401 is called a lower layer, and a layer closer to the output layer 403 is called an upper layer. Each unit is a computational element that combines the calculation results received from the units in the lower layer according to weight values (for example, by a weighted-sum operation) and transmits the result of that combination to the units in the upper layer.

The function of the neural network 400 is defined by configuration information, which represents the number of layers in the neural network 400 and the number of units in each layer, and by the weights W = [w1, w2, ...], which represent the weight values used in the weighted-sum operation at each unit.

As shown in FIG. 4, input data X = [x1, x2, ...] (pixel values, in the case of an image) are input to the input units 405 of the input layer 401 of the neural network 400. The hidden units 406 of the hidden layer 402 and the output units 407 of the output layer 403 then perform weighted-sum operations using the weights W = [w1, w2, ...], and output data Y = [y1, y2, ...] are output as the output values of the output units 407 of the output layer 403.
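The layer-by-layer weighted-sum computation can be sketched in a few lines. The text describes only the weighted sums; the tanh activation applied at each unit, the omission of bias terms, and the example weights are assumptions added to make the sketch concrete.

```python
import math


def forward(x, layers):
    """Propagate input x through the network.

    layers: list of weight matrices, one per layer; each row holds the
    weights of one unit with respect to the outputs of the layer below.
    """
    activation = x
    for weights in layers:
        activation = [
            # Weighted sum of lower-layer outputs, then tanh (assumed).
            math.tanh(sum(w * a for w, a in zip(row, activation)))
            for row in weights
        ]
    return activation


# Two inputs -> two hidden units -> one output unit (made-up weights).
layers = [
    [[0.5, -0.5], [1.0, 1.0]],
    [[1.0, -1.0]],
]
y = forward([1.0, 1.0], layers)
print(y)
```

The list of matrix shapes plays the role of the configuration information, and the matrix entries play the role of the weights W.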

Adjusting the values of the weights W of the neural network 400 so that the desired output data Y are output when input data X are input to the neural network 400 is called training the neural network. To train the neural network 400, for example, a loss function is defined that represents the error between the output produced for the input data X and the desired output data Y, and the weights W are updated by gradient descent along the gradient that decreases the loss function.
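As a small worked example of such an update, the sketch below fits a single linear unit by gradient descent on the squared error. The single-unit model, the learning rate, the number of epochs, and the sample data are all assumptions chosen to keep the example short.

```python
def train_linear_unit(samples, lr=0.1, epochs=200):
    """Fit y = w0*x0 + w1*x1 by gradient descent on the squared error."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, target in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))
            error = y - target
            # The gradient of 0.5 * error**2 with respect to w_i is error * x_i,
            # so each weight moves against that gradient.
            w = [wi - lr * error * xi for wi, xi in zip(w, x)]
    return w


# Targets generated by the weights (2, -1), which training should recover.
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
trained = train_linear_unit(samples)
print(trained)
```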

Machine learning problems, including those handled by neural networks, fall into three categories according to the type of output data: classification problems, regression problems, and generation problems. The classification problem is described first. FIG. 5 is a diagram explaining how a neural network learns a classification problem. A classification problem is, for example, the problem of finding the values of the weights W of a neural network so that it determines whether a given image shows a "dog" or a "cat".

より具体的には、ニューラルネットワーク４００では、各出力ユニットは入力データＸを分類するための異なる正解ラベルに対応付けられている。また、荷重Ｗは複数の入力データＸの各々が入力されたときに、当該入力データＸの正しい正解ラベルに対応する出力ユニットの出力値が１に近づき、他の出力ユニットの出力値が０に近づくように調整される。 More specifically, in the neural network 400, each output unit is associated with a different correct label for classifying the input data X. The weight W is adjusted so that, when each of a plurality of input data X is input, the output value of the output unit corresponding to the correct label of that input data X approaches 1 and the output values of the other output units approach 0.

つまり、図５に示される例では、ニューラルネットワーク４００において、各出力ユニットは、例えば、「犬」であれば正解ラベルが「１、０」、「猫」であれば正解ラベルが「０、１」というように、「犬」及び「猫」を示す正解ラベルのうち異なる１つの正解ラベルが対応付けられる。また、荷重Ｗは、「犬」の正解ラベルが付与された学習データが入力されると出力データ［１、０］を出力し、「猫」が付与された学習データが入力されると出力データ［０、１］を出力するように調整される。 That is, in the example shown in FIG. 5, each output unit of the neural network 400 is associated with a different one of the correct labels indicating "dog" and "cat"; for example, the correct label is "1, 0" for "dog" and "0, 1" for "cat". The weight W is adjusted so that output data [1, 0] is output when training data with the correct label "dog" is input, and output data [0, 1] is output when training data with the correct label "cat" is input.

なお、ニューラルネットワーク４００が多層ニューラルネットワークである場合、特に、前記教師付き学習を行う前に、「ｌａｙｅｒ－ｗｉｓｅｐｒｅ－ｔｒａｉｎｉｎｇ」と呼ばれる学習手法によって、荷重値は隠れ層ごとに個別に調整されても良い。これにより、その後の学習において、汎化性能が高い荷重Ｗが得られる。 When the neural network 400 is a multilayer neural network, the weight values may be adjusted individually for each hidden layer by a learning method called "layer-wise pre-training", in particular before the supervised learning described above is performed. As a result, a weight W with high generalization performance is obtained in the subsequent learning.

また、ニューラルネットワーク４００の荷重値の調整には、上述した勾配降下法の他にも、例えば、「Ａｄａｍ」などの周知のアルゴリズムが用いらてもよい。また、ニューラルネットワーク４００が多層ニューラルネットワークである場合、各層で個別に学習が実行されても良い。 In addition to the gradient descent method described above, for example, a known algorithm such as “Adam” may be used to adjust the weight values of the neural network 400 . Also, if the neural network 400 is a multilayer neural network, each layer may be trained separately.

次に、回帰問題について説明する。図６は、ニューラルネットワークが回帰問題を学習する様子を説明する図である。回帰問題は「犬」及び「猫」といった正解ラベルとは異なり、例えば、材料の「特性値」のような実数値が推定されるように、ニューラルネットワークの荷重Ｗの値を求める問題のことを指す。 Next, the regression problem will be explained. FIG. 6 is a diagram explaining how a neural network learns a regression problem. Unlike a problem with correct labels such as "dog" and "cat", the regression problem refers to the problem of obtaining the value of the weight W of the neural network so that a real value, such as a "characteristic value" of a material, is estimated.

より具体的には、ニューラルネットワーク４００では、出力ユニットのそれぞれには、出力データＹを構成する実数値ｙ１，ｙ２が対応付けられ、入力ユニットのそれぞれには、入力データＸを構成する実数値ｘ１，ｘ２が対応付けられている。荷重Ｗは、複数の入力データＸの各々が入力されたときに、当該入力データＸに対応する正しい実数値が各出力ユニットから出力されるように調整される。 More specifically, in the neural network 400, the real values y1 and y2 forming the output data Y are associated with the respective output units, and the real values x1 and x2 forming the input data X are associated with the respective input units. The weight W is adjusted so that, when each of a plurality of input data X is input, the correct real value corresponding to that input data X is output from each output unit.

図６の例では、ニューラルネットワーク４００において、出力データＹは、実数値ｙ１が例えば電圧、実数値ｙ２が例えば容量というように、材料の特性値で構成されている。また、入力データＸは、ｘ１が例えばパラメータα、ｘ２がパラメータβ、ｘ３が特徴量というように、材料の「特徴量及び記述子」で構成されている。ここでは、「特徴量及び記述子」と記載したが、これらは少なくとも一方が用いられてもよい。記述子は、図２で説明した材料の結晶構造を示すパラメータである。特徴量は、図３に示すように元素の割合によって算出される一次元ベクトルである。なお、荷重Ｗは、ニューラルネットワーク４００から出力（推定）された特性値が正解特性値に近づくように調整される。 In the example of FIG. 6, in the neural network 400, the output data Y is composed of material characteristic values, such as the voltage for the real value y1 and the capacitance for the real value y2. Also, the input data X is composed of "feature amounts and descriptors" of the material such that x1 is the parameter α, x2 is the parameter β, and x3 is the feature amount. Here, "feature quantity and descriptor" are described, but at least one of these may be used. A descriptor is a parameter that indicates the crystal structure of the material described in FIG. The feature quantity is a one-dimensional vector calculated from the ratio of elements as shown in FIG. The weight W is adjusted so that the characteristic value output (estimated) from the neural network 400 approaches the correct characteristic value.

次に、生成問題について説明する。図７は、ニューラルネットワークが生成問題を学習する様子を説明する図である。生成問題は、上記の識別問題又は回帰問題とは異なり、既知ＤＢに含まれない何らかのデータを生成する問題である。例えば、ある特性値を持つ組成式を複数生成するというように、生成問題は回帰問題の逆問題に対応する。加えて、識別問題及び回帰問題は入力と出力とが１対１対応となっている必要があるが、生成問題は１つの入力データＸに対して複数の出力データＹが出力されてもよい。これは、ある特性値を満たす組成式は複数存在しうるからである。図１に示す生成モデル学習部１０２は、この生成問題を解くニューラルネットワークを学習する。 Next, the generation problem will be explained. FIG. 7 is a diagram explaining how the neural network learns the generation problem. A generation problem is a problem that generates some data that is not included in the known DB, unlike the identification problem or regression problem described above. For example, a generation problem corresponds to an inverse problem of a regression problem, such as generating multiple composition formulas with a certain characteristic value. In addition, while identification problems and regression problems require a one-to-one correspondence between inputs and outputs, a plurality of output data Y may be output for one input data X in a generation problem. This is because there can be a plurality of composition formulas that satisfy certain characteristic values. The generative model learning unit 102 shown in FIG. 1 learns a neural network that solves this generative problem.

より具体的には、図７に示すニューラルネットワーク４００では、出力ユニットのそれぞれには出力データＹを構成する実数値ｙ１，ｙ２が対応付けられ、入力ユニットのそれぞれには入力データＸを構成する実数値ｘ１，ｘ２，・・・対応付けられている。また、荷重Ｗは、複数の入力データＸの各々が入力されたときに、複数の入力データＸのそれぞれに対応する正しい実数値が出力されるように調整される。例えば、組成式「Ｌｉ_０－１ＴｉＯ_２」の特性値である電圧「９．４３」及び容量「１２１４」が入力データＸ［ｘ１，ｘ２］として入力されると、組成式「Ｌｉ_０－１ＴｉＯ_２」の「特徴量及び記述子」が出力されるように荷重Ｗが調整される。 More specifically, in the neural network 400 shown in FIG. 7, the real values y1, y2 forming the output data Y are associated with the respective output units, and the real values x1, x2, ... forming the input data X are associated with the respective input units. The weight W is adjusted so that, when each of a plurality of input data X is input, the correct real values corresponding to that input data X are output. For example, the weight W is adjusted so that, when the voltage "9.43" and the capacity "1214", which are characteristic values of the composition formula "Li_0-1TiO_2", are input as input data X[x1, x2], the "feature quantity and descriptor" of the composition formula "Li_0-1TiO_2" are output.

このとき、入力データＸは、識別問題又は回帰問題における出力データＹに対応し、出力データＹは識別問題又は回帰問題における入力データＸに対応する。 At this time, the input data X corresponds to the output data Y in the identification problem or regression problem, and the output data Y corresponds to the input data X in the identification problem or regression problem.

つまり、図７に示すニューラルネットワーク４００において、出力データＹは、ｙ１が例えば組成式「Ｌｉ_０－１ＴｉＯ_２」の記述子のパラメータα、ｙ２が例えば組成式「Ｌｉ_０－１ＴｉＯ_２」の記述子のパラメータβというように、材料の特徴量及び記述子で構成されている。また、入力データＸは、ｘ１が例えば組成式「Ｌｉ_０－１ＴｉＯ_２」の電圧、ｘ２が例えば組成式「Ｌｉ_０－１ＴｉＯ_２」の容量というように、材料の特性値で構成されている。 That is, in the neural network 400 shown in FIG. 7, the output data Y is composed of the feature quantities and descriptors of the material; for example, y1 is the parameter α of the descriptor of the composition formula "Li_0-1TiO_2" and y2 is the parameter β of the descriptor of the composition formula "Li_0-1TiO_2". The input data X is composed of characteristic values of the material; for example, x1 is the voltage of the composition formula "Li_0-1TiO_2" and x2 is the capacity of the composition formula "Li_0-1TiO_2".

そして、荷重Ｗは、ニューラルネットワーク４００から出力（推定）された「特徴量及び記述子」が正解に近づくように調整される。 Then, the weight W is adjusted so that the "feature quantity and descriptor" output (estimated) from the neural network 400 approaches the correct answer.

前述のように、生成モデル学習部１０２において、ニューラルネットワーク４００は、入力データＸから出力データＹを生成する生成問題を解く。この生成問題は、例えば、敵対的学習アルゴリズムによって導出される。敵対的学習アルゴリズムは、「ｇｅｎｅｒａｔｉｖｅａｄｖｅｒｓａｒｉａｌｎｅｔｗｏｒｋ（ＧＡＮ）」とも称され、近年盛んに研究が行われている。 As described above, in the generative model learning unit 102, the neural network 400 solves the generation problem of generating the output data Y from the input data X. This generation problem is solved, for example, by an adversarial learning algorithm. The adversarial learning algorithm is also called a "generative adversarial network (GAN)" and has been studied extensively in recent years.

以下、組成式の生成を題材として、材料応用へ適している、ＧＡＮの一種であるｃｏｎｄｉｔｉｏｎａｌ－ＧＡＮ（Ｃ－ＧＡＮ）について説明する。Ｃ－ＧＡＮは、生成部と識別部とのそれぞれを構成する２種類のニューラルネットワークを利用して生成部のニューラルネットワークを学習させるアルゴリズムである。図８は、図１に示す生成モデル学習部１０２の詳細な構成の一例を示す図である。 In the following, conditional-GAN (C-GAN), a type of GAN suited to material applications, is described using the generation of composition formulas as the subject. C-GAN is an algorithm that trains the neural network of the generation unit by using two types of neural networks, which constitute the generation unit and the identification unit respectively. FIG. 8 is a diagram showing an example of a detailed configuration of the generative model learning unit 102 shown in FIG. 1.

生成モデル学習部１０２は、乱数発生部８０１、生成部８０２、及び識別部８０３を備える。乱数発生部８０１は、所望数の乱数を発生させる。この乱数の発生数が生成部８０２にて生成される疑似組成式の数となる。例えば、乱数の発生数が１０００個の場合、１０００個の疑似組成式が生成される。 Generative model learning unit 102 includes random number generating unit 801 , generating unit 802 , and identifying unit 803 . A random number generator 801 generates a desired number of random numbers. The number of random numbers generated is the number of pseudo composition formulas generated by the generator 802 . For example, if the number of random numbers generated is 1000, 1000 pseudo composition formulas are generated.

次に、生成部８０２は、各乱数と既知組成式リスト１０１に含まれる既知組成式の各特性値とのペアをニューラルネットワークＮＮ１に入力する。すなわち、ニューラルネットワークＮＮ１の入力データＸはＸ＝［乱数、特性値］となる。この入力データＸが与えられたとき、ニューラルネットワークＮＮ１は荷重Ｗの値に従って出力データＹを算出する。この出力データＹが疑似組成式に対応する。ここで、疑似組成式とは、図２に示す記述子、図３に示す特徴量、又はこれらの組み合わせに対応し、元の組成式への変換が容易な量を指す。 Next, the generation unit 802 inputs pairs of each random number and each characteristic value of the known composition formulas included in the known composition formula list 101 to the neural network NN1. That is, the input data X of the neural network NN1 is X=[random number, characteristic value]. When this input data X is given, the neural network NN1 calculates output data Y according to the value of the weight W. This output data Y corresponds to the pseudo composition formula. Here, the pseudo composition formula refers to a quantity that corresponds to the descriptor shown in FIG. 2, the feature quantity shown in FIG. 3, or a combination thereof, and that can be easily converted back to the original composition formula.

識別部８０３は、生成部８０２で生成された疑似組成式の「特徴量及び記述子」が、実際の組成式の「特徴量及び記述子」と近しい値を生成できているかを判定する。この判定を行うために、識別部８０３を構成するニューラルネットワークＮＮ２は、図９に示すように、分類問題及び回帰問題の両問題を解く。 The identification unit 803 determines whether the “feature quantity and descriptor” of the pseudo composition formula generated by the generation unit 802 can generate values close to the “feature quantity and descriptor” of the actual composition formula. In order to make this determination, the neural network NN2 that constitutes the identification unit 803 solves both the classification problem and the regression problem, as shown in FIG.

図９は、識別部８０３を構成するニューラルネットワークＮＮ２が学習する様子を説明する図である。図９において、ニューラルネットワーク４００は、ニューラルネットワークＮＮ２に対応する。ニューラルネットワーク４００には、生成部８０２が生成した疑似組成式の「特徴量及び記述子」が入力データＸとして入力される。また、ニューラルネットワーク４００には、既知組成式リスト１０１に含まれる既知組成式の「特徴量及び記述子」が入力データＸとして入力される。 FIG. 9 is a diagram for explaining how the neural network NN2 forming the identification unit 803 learns. In FIG. 9, neural network 400 corresponds to neural network NN2. The “characteristic quantity and descriptor” of the pseudo-composition formula generated by the generator 802 is input as input data X to the neural network 400 . In addition, “characteristic amounts and descriptors” of known composition formulas included in the known composition formula list 101 are input as input data X to the neural network 400 .

分類問題においては、ニューラルネットワーク４００は、既知組成式リスト１０１に含まれる既知組成式の「特徴量及び記述子」を「本物」と判別し、疑似組成式の「特徴量及び記述子」を「偽物」と判別する二値判別を実施する。すなわち、ニューラルネットワーク４００は、本物ラベルを持つ「特徴量及び記述子」が入力された場合、出力データＹ＝［１、０］を出力し、偽物ラベルを持つ「特徴量及び記述子」が入力された場合、出力データＹ＝［０、１］を出力するように荷重Ｗの値を調整する。ここで、［１、０］は本物ラベルに対応する出力データＹを表し、［０、１］は偽物ラベルに対応する出力データＹを表す。 In the classification problem, the neural network 400 determines the "feature amounts and descriptors" of the known composition formulas included in the known composition formula list 101 as "genuine", and the "feature amounts and descriptors" of the pseudo-composition formulas as " Binary discrimination to discriminate as "fake" is performed. That is, the neural network 400 outputs output data Y=[1, 0] when "features and descriptors" with genuine labels are input, and "features and descriptors" with fake labels are input. If so, the value of the load W is adjusted so as to output the output data Y=[0, 1]. Here, [1, 0] represents the output data Y corresponding to the genuine label, and [0, 1] represents the output data Y corresponding to the fake label.

一方、回帰問題においては、ニューラルネットワーク４００は、既知組成式の「特徴量及び記述子」が入力された場合、特性値を正しく推定できるように荷重Ｗの値を調整する。すなわち、識別部８０３を構成するニューラルネットワークＮＮ２は、「二値判別＋特性値推定」の両方が正しく行えるように荷重Ｗを調整する。 On the other hand, in the regression problem, the neural network 400 adjusts the value of the weight W so that the characteristic value can be estimated correctly when the "feature quantity and descriptor" of the known composition formula are input. That is, the neural network NN2 that constitutes the identification unit 803 adjusts the weight W so that both "binary discrimination + characteristic value estimation" can be performed correctly.
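
The combined "binary discrimination + characteristic value estimation" objective of the identification unit can be sketched as a single loss. The squared-error form, the weighting factor `beta`, and the function name `discriminator_loss` are assumptions for illustration; the description does not specify how the two tasks are combined.

```python
def discriminator_loss(label_pred, label_true,
                       props_pred=None, props_true=None, beta=1.0):
    """Sum of a real/fake term and, for known formulas, a regression term."""
    # Real/fake term: [1, 0] is the genuine label, [0, 1] the fake label.
    loss = sum((p - t) ** 2 for p, t in zip(label_pred, label_true))
    # Characteristic-value term, applied when true characteristic values
    # are available (the known composition formulas).
    if props_true is not None:
        loss += beta * sum((p - t) ** 2 for p, t in zip(props_pred, props_true))
    return loss

# A known formula judged mostly genuine, with a slightly wrong voltage estimate.
loss_known = discriminator_loss([0.9, 0.1], [1.0, 0.0],
                                props_pred=[9.0, 1214.0],
                                props_true=[9.43, 1214.0])
# A pseudo formula judged mostly fake; only the real/fake term applies.
loss_pseudo = discriminator_loss([0.2, 0.8], [0.0, 1.0])
```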

図８に参照を戻す。識別部８０３での出力データ（Ｙ［本物、偽物］、Ｙ［特性値］）は、生成部８０２へ入力され、生成部８０２を構成するニューラルネットワークＮＮ１の荷重Ｗの調整にも利用される。生成部８０２は、識別部８０３において疑似組成式が「本物」と間違われるようにニューラルネットワークＮＮ１の荷重Ｗを調整する。 Reference is returned to FIG. 8. The output data (Y [genuine, fake], Y [characteristic value]) of the identification unit 803 is input to the generation unit 802 and is also used to adjust the weight W of the neural network NN1 that constitutes the generation unit 802. The generation unit 802 adjusts the weight W of the neural network NN1 so that the identification unit 803 mistakes the pseudo composition formula for "genuine".

出力データ（Ｙ［本物、偽物］、Ｙ［特性値］）が入力されると、生成部８０２は、出力データＹ［本物、偽物］と本物ラベルに対応する出力データＹ［１，０］との距離Ｄを算出する。例えば、出力データＹ［本物、偽物］としてＹ＝［０．８、０．２］が入力された場合、生成部８０２は、距離ＤとしてＤ＝ｓｑｒｔ（（１－０．８）×（１－０．８）＋（０－０．２）×（０－０．２））＝０．３を算出する。一方、出力データＹ［本物、偽物］として、Ｙ＝［０．２、０．８］が入力された場合、生成部８０２は、距離ＤとしてＤ＝ｓｑｒｔ（（１－０．２）×（１－０．２）＋（０－０．８）×（０－０．８））＝１．１を算出する。この場合、出力データＹ［０．８、０．２］の方が出力データＹ［０．２、０．８］よりも距離Ｄが小さいため、本物に近いことが分かる。そのため、出力データＹ［０．８、０．２］に対応する疑似組成式の方が、出力データＹ［０．２、０．８］に対応する疑似組成式よりも、生成部８０２の生成結果として好ましい。そこで、生成部８０２は、距離Ｄが小さくなるように、ニューラルネットワークＮＮ１の荷重Ｗを調整する。 When the output data (Y [genuine, fake], Y [characteristic value]) is input, the generation unit 802 calculates the distance D between the output data Y [genuine, fake] and the output data Y [1, 0] corresponding to the genuine label. For example, when Y=[0.8, 0.2] is input as the output data Y [genuine, fake], the generation unit 802 calculates the distance D as D=sqrt((1−0.8)×(1−0.8)+(0−0.2)×(0−0.2))≈0.3. On the other hand, when Y=[0.2, 0.8] is input as the output data Y [genuine, fake], the generation unit 802 calculates the distance D as D=sqrt((1−0.2)×(1−0.2)+(0−0.8)×(0−0.8))≈1.1. In this case, the output data Y[0.8, 0.2] has a smaller distance D than the output data Y[0.2, 0.8] and is therefore closer to genuine. Accordingly, the pseudo composition formula corresponding to the output data Y[0.8, 0.2] is preferable as a generation result of the generation unit 802 to the pseudo composition formula corresponding to the output data Y[0.2, 0.8]. The generation unit 802 therefore adjusts the weight W of the neural network NN1 so that the distance D becomes smaller.
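
The distance D in this paragraph is an ordinary Euclidean distance to the genuine label [1, 0], so the two numerical examples can be checked with a few lines of Python. The function name `distance_to_genuine` is a hypothetical label introduced only for this check.

```python
import math

def distance_to_genuine(y, genuine_label=(1.0, 0.0)):
    """Euclidean distance between Y[genuine, fake] and the genuine label."""
    return math.sqrt(sum((g - v) ** 2 for g, v in zip(genuine_label, y)))

d_close = distance_to_genuine([0.8, 0.2])  # about 0.283, rounded to 0.3 in the text
d_far = distance_to_genuine([0.2, 0.8])    # about 1.131, rounded to 1.1 in the text
```

Because `d_close` is smaller than `d_far`, the pseudo composition formula that produced Y=[0.8, 0.2] is the preferred generation result.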

加えて、生成部８０２は、識別部８０３からの出力データＹ［特性値］も入力されている。この出力データＹ［特性値］は、生成部８０２が生成した疑似組成式に対する識別部８０３の特性値の推定結果を表している。識別部８０３からの出力データＹ［特性値］が、対応する疑似組成式の生成に用いられた特性値（既知組成式の特性値）と近い値になっているか否かも重要な情報である。すなわち、ある疑似組成式に対する識別部８０３の特性値の推定結果がその疑似組成式の生成に用いられた特性値に近いほど、生成部８０２はより本物に近い（実在する可能性の高い）疑似組成式を生成できていると言える。 In addition, the output data Y [characteristic value] from the identification unit 803 is also input to the generation unit 802. This output data Y [characteristic value] represents the characteristic value that the identification unit 803 estimated for the pseudo composition formula generated by the generation unit 802. Whether or not the output data Y [characteristic value] from the identification unit 803 is close to the characteristic value used to generate the corresponding pseudo composition formula (the characteristic value of a known composition formula) is also important information. That is, the closer the characteristic value estimated by the identification unit 803 for a pseudo composition formula is to the characteristic value used to generate that pseudo composition formula, the closer to genuine (the more likely to actually exist) the pseudo composition formula generated by the generation unit 802 can be said to be.

そこで、生成部８０２は、識別部８０３からの出力データＹ［特性値］と、出力データＹ［特性値］に対応する疑似組成式を生成する際に用いた特性値との差（距離）が小さくなるように、ニューラルネットワークＮＮ１の荷重Ｗを調整する。このように、生成部８０２は、上記２種類の距離を用いてニューラルネットワークＮＮ１の荷重Ｗを調整する。 Therefore, the generation unit 802 adjusts the weight W of the neural network NN1 so that the difference (distance) between the output data Y [characteristic value] from the identification unit 803 and the characteristic value used when generating the pseudo composition formula corresponding to that output data Y [characteristic value] becomes smaller. In this way, the generation unit 802 adjusts the weight W of the neural network NN1 using the two types of distances described above.
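
One way to picture how the two distances could drive the generation unit is a single objective that the weight update tries to reduce. The description does not say how the two distances are combined, so the weighted sum, the factor `alpha`, and the function names below are assumptions for illustration only.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def generator_objective(y_genuine_fake, props_estimated, props_condition,
                        alpha=1.0):
    """Smaller is better for the generation unit."""
    # Distance 1: identification output versus the genuine label [1, 0].
    d_label = euclidean(y_genuine_fake, (1.0, 0.0))
    # Distance 2: characteristic values estimated by the identification unit
    # versus the characteristic values given to the generation unit.
    d_props = euclidean(props_estimated, props_condition)
    return d_label + alpha * d_props

good = generator_objective([0.8, 0.2], [9.40, 1214.0], [9.43, 1214.0])
poor = generator_objective([0.2, 0.8], [5.00, 1000.0], [9.43, 1214.0])
```

In practice the two characteristic values have very different scales, so some normalization would likely be needed before summing; that choice is outside what the description states.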

これによって、生成部８０２を構成するニューラルネットワークＮＮ１は、入力された特性値を持つような疑似組成式を生成することができるようになる。 As a result, the neural network NN1 that constitutes the generator 802 can generate a pseudo-composition formula having the input characteristic values.

このとき、識別部８０３を構成するニューラルネットワークＮＮ２は、二値判別と特性値推定の２つの問題（タスク）を同時に解くことになる。これは、マルチタスク学習と呼ばれる技術である。マルチタスク学習は、同時に異なるタスクを解くことで、タスク間の知見を共有することができ、各タスクの精度を向上させることができる。 At this time, the neural network NN2 constituting the identification unit 803 simultaneously solves two problems (tasks) of binary discrimination and characteristic value estimation. This is a technique called multitask learning. Multi-task learning solves different tasks at the same time, so that knowledge between tasks can be shared and the accuracy of each task can be improved.

生成部８０２が生成する疑似組成式の精度は学習回数が増大するにつれて徐々に向上していくため、識別部８０３が解く問題の難易度は徐々に上昇していく。簡単な問題からニューラルネットワークに解かせ、徐々に問題の難易度を上昇させていく学習方法はカリキュラム学習と呼ばれている。したがって、本実施の形態では、識別部８０３にカリキュラム学習を実施させてもよい。カリキュラム学習では、簡易な問題が難易度の高い問題に対する制約の役割を果たす。そのため、識別部８０３は判別精度及び識別精度の高いニューラルネットワークＮＮ２を生成できる。 Since the accuracy of the pseudo-composition formula generated by the generation unit 802 gradually improves as the number of times of learning increases, the difficulty level of the problem to be solved by the identification unit 803 gradually increases. Curriculum learning is a learning method in which a simple problem is solved by a neural network and the difficulty level of the problem is gradually increased. Therefore, in this embodiment, the identification unit 803 may be caused to carry out curriculum learning. In curriculum learning, easy problems act as constraints to difficult problems. Therefore, the identification unit 803 can generate a neural network NN2 with high discrimination accuracy and identification accuracy.

このように、マルチタスク学習及びカリキュラム学習を実施することによって、識別部８０３は既知ＤＢ外のデータに対する汎化性能が向上する。そして、識別部８０３の汎化性能が高まると、生成部８０２の汎化性能も高めることができる。その結果、生成部８０２は、既知ＤＢ内に含まれない特性値が入力されたとしても、識別部８０３のニューラルネットワークＮＮ２が「本物」と誤判定してしまうような疑似組成式が生成できる。 By implementing multitask learning and curriculum learning in this way, the identification unit 803 improves generalization performance for data outside the known DB. When the generalization performance of the identification unit 803 is improved, the generalization performance of the generation unit 802 can also be improved. As a result, the generation unit 802 can generate a pseudo-composition formula that causes the neural network NN2 of the identification unit 803 to erroneously determine that it is "genuine" even if a characteristic value not included in the known DB is input.

図１に参照を戻し、入力部１０３について説明する。入力部１０３は、乱数発生部８０１における乱数の発生数、すなわち、生成したい疑似組成式の数を指定するユーザからの入力を受け付ける。また、入力部１０３は、合成可能性の評価対象となる組成式を指定するユーザからの入力を受け付ける。入力部１０３は、例えば、キーボード、及びマウスといった入力装置で構成される。 Referring back to FIG. 1, the input unit 103 will be described. The input unit 103 receives an input from the user specifying the number of random numbers generated by the random number generator 801, that is, the number of pseudo composition formulas to be generated. The input unit 103 also receives an input from the user who designates a composition formula to be evaluated for synthesis possibility. The input unit 103 is composed of input devices such as a keyboard and a mouse.

次に、材料情報出力装置１００について説明する。疑似組成式保持部１０４は、生成モデル学習部１０２が生成した疑似組成式を保持する。疑似組成式は、生成モデル学習部１０２が備える生成部８０２のニューラルネットワークＮＮ１の出力データＹのことを指し、「特徴量及び記述子」で構成される。通常、疑似組成式保持部１０４は、学習完了後に生成部８０２により生成された疑似組成式を保持するが、本実施の形態においては、学習過程において生成部８０２により生成された疑似組成式も保持する。 Next, the material information output device 100 will be described. The pseudo composition formula holding unit 104 holds the pseudo composition formulas generated by the generative model learning unit 102. A pseudo composition formula refers to the output data Y of the neural network NN1 of the generation unit 802 provided in the generative model learning unit 102, and is composed of "feature quantities and descriptors". Normally, the pseudo composition formula holding unit 104 would hold the pseudo composition formulas generated by the generation unit 802 after learning is completed; in the present embodiment, however, it also holds the pseudo composition formulas generated by the generation unit 802 during the learning process.

学習過程においては、まず、生成部８０２及び識別部８０３のニューラルネットワークＮＮ１，ＮＮ２の荷重Ｗは、初期値が設定される。その後、初期値が設定されたニューラルネットワークＮＮ１，ＮＮ２は、前述した敵対的学習アルゴリズムにより学習を進めていく。このとき、生成部８０２は、ニューラルネットワークＮＮ１が各学習段階で生成した疑似組成式を学習段階を示す数値と対応付けて疑似組成式保持部１０４に保持させる。なお、ニューラルネットワークＮＮ１，ＮＮ２の荷重Ｗは、例えば、勾配降下法等で調整される。 In the learning process, initial values are first set for the weights W of the neural networks NN1 and NN2 of the generation unit 802 and the identification unit 803. The neural networks NN1 and NN2 with these initial values then proceed with learning by the adversarial learning algorithm described above. At this time, the generation unit 802 causes the pseudo composition formula holding unit 104 to hold the pseudo composition formulas generated by the neural network NN1 at each learning stage in association with a numerical value indicating the learning stage. The weights W of the neural networks NN1 and NN2 are adjusted by, for example, the gradient descent method.

例えば、乱数の発生数が１０００であるとすると、ニューラルネットワークＮＮ１は、１回の学習を行う度に１０００個の疑似組成式を生成する。したがって、総学習回数がＴ（Ｔは２以上の整数）回であるとすると、生成部８０２は、０回目の学習で生成した１０００個の疑似組成式、１回目の学習で生成した１０００個の疑似組成式、・・・、Ｔ回目の学習で生成した１０００個の疑似組成式というように、１０００×（Ｔ＋１）個の疑似組成式を学習段階を示す数値（０回目、１回目、・・・、Ｔ回目）と対応付けて疑似組成式保持部１０４に保持させる。なお、０回目の学習においては、荷重Ｗとして初期値が設定されたニューラルネットワークＮＮ１に対して、乱数及び既知組成式の特性値の１０００個のペアが入力され、各ペアに対応する１０００個の疑似組成式が生成される。 For example, if the number of random numbers generated is 1000, the neural network NN1 generates 1000 pseudo composition formulas each time learning is performed. Therefore, assuming that the total number of learning iterations is T (T is an integer of 2 or more), the generation unit 802 causes the pseudo composition formula holding unit 104 to hold 1000×(T+1) pseudo composition formulas, namely the 1000 generated in the 0th learning, the 1000 generated in the 1st learning, ..., and the 1000 generated in the T-th learning, each in association with the numerical value indicating its learning stage (0th, 1st, ..., T-th). In the 0th learning, 1000 pairs of a random number and a characteristic value of a known composition formula are input to the neural network NN1 whose weight W is set to its initial value, and 1000 pseudo composition formulas corresponding to the respective pairs are generated.
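
The bookkeeping described here, in which every learning stage from 0 to T contributes the same number of pseudo composition formulas, can be sketched as follows. The callables `generate` and `train_step` are hypothetical stand-ins for the generation unit 802 and for one adversarial update; neither name comes from the description.

```python
def collect_pseudo_formulas(generate, train_step, total_steps, n_random):
    """Return {stage t: list of pseudo composition formulas} for t = 0..T."""
    store = {}
    for t in range(total_steps + 1):
        if t > 0:
            train_step(t)  # stage 0 uses the initial weights, so no update yet
        store[t] = generate(n_random)
    return store

# Dummy stand-ins so that the sketch runs without a real network.
dummy_generate = lambda n: [(0.0, 0.0)] * n
dummy_train_step = lambda t: None

T = 3
store = collect_pseudo_formulas(dummy_generate, dummy_train_step, T, 1000)
total = sum(len(formulas) for formulas in store.values())  # 1000 * (T + 1)
```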

教師ラベル付与部１０５は、疑似組成式保持部１０４に保持されている各疑似組成式の合成可能性を、各疑似組成式に対応付けられた学習段階を示す数値を用いて算出し、算出した合成可能性を疑似組成式保持部１０４が保持する疑似組成式の教師ラベルとして付与する。 The teacher label assigning unit 105 calculates the synthesizability of each pseudo composition formula held in the pseudo composition formula holding unit 104 by using the numerical value indicating the learning stage associated with each pseudo composition formula. The synthesis possibility is given as a teaching label of the pseudo-composition formula held by the pseudo-composition formula holding unit 104 .

ここで、教師ラベル付与部１０５は、総学習回数Ｔに対する学習段階を示す数値ｔの割合を教師ラベルとして付与する。この場合、教師ラベルは０～１の値を持つ１次元の実数値となる。例えば３回目の学習で得られた疑似組成式の教師ラベルＬはＬ＝［３／Ｔ］となる。しかし、この教師ラベルの付与方法では、異なる数値ｔに対して同一の疑似組成式が生成された場合、同一の疑似組成式に対して複数の教師ラベルＬが付与されてしまう。 Here, the teacher label assigning unit 105 assigns the ratio of the numerical value t indicating the learning stage to the total number of learning times T as the teacher label. In this case, the teacher label is a one-dimensional real number with a value between 0 and 1. For example, the teacher label L of the pseudo-composition formula obtained in the third learning is L=[3/T]. However, in this teaching label assigning method, when the same pseudo-composition formula is generated for different numerical values t, a plurality of teaching labels L are assigned to the same pseudo-composition formula.

そこで、本実施の形態では、同一の疑似組成式に対して複数の教師ラベルＬが付与された場合、教師ラベル付与部１０５は、当該疑似組成式に対して、最後に生成されたときの教師ラベルＬ、すなわち、最大の教師ラベルＬを付与する。例えば、ある疑似組成式が、５回目、２０回目、８０回目のそれぞれの学習段階で生成されたとすると、その疑似組成式に対して付与される教師ラベルＬはＬ＝８０／Ｔとなる。これにより、教師ラベル付与部１０５は、疑似組成式と教師ラベルＬとを一対一で対応付けることができる。 Therefore, in this embodiment, when a plurality of teacher labels L are assigned to the same pseudo-composition formula, the teacher-label assigning unit 105 assigns the last generated teacher label to the pseudo-composition formula. A label L, that is, the maximum teacher label L is given. For example, if a certain pseudo-composition formula is generated in each of the 5th, 20th, and 80th learning stages, the teacher label L assigned to the pseudo-composition formula is L=80/T. As a result, the teacher label assigning unit 105 can associate the pseudo composition formula with the teacher label L on a one-to-one basis.
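
The labeling rule, L = t/T with the largest t kept for duplicated pseudo composition formulas, can be sketched in a few lines. Representing a pseudo composition formula by a hashable key (for example a tuple of its feature quantities and descriptors) and the function name `assign_teacher_labels` are assumptions made for illustration.

```python
def assign_teacher_labels(records, total_steps):
    """records: iterable of (formula_key, t). Returns {formula_key: L}."""
    latest = {}
    for key, t in records:
        # Keep the last (largest) learning stage at which the formula appeared.
        latest[key] = max(t, latest.get(key, 0))
    return {key: t / total_steps for key, t in latest.items()}

# Formula "A" appears at stages 5, 20 and 80; formula "B" only at stage 0.
records = [("A", 5), ("B", 0), ("A", 20), ("A", 80)]
labels = assign_teacher_labels(records, total_steps=100)
# labels["A"] is 80/100 and labels["B"] is 0/100.
```

Each formula ends up with exactly one label, which is the one-to-one correspondence the paragraph requires.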

推定モデル学習部１０６は、教師ラベル付与部１０５により教師ラベルが付与された疑似組成式を学習することで、任意の組成式の合成可能性を推定するための推定モデルを生成する。ここで、推定モデル学習部１０６は、疑似組成式が何回目の学習で得られた疑似組成式であるかを示す０～１の実数値を推定する回帰問題を解くことによって推定モデルを生成する。すなわち、推定モデル学習部１０６は、入力データＸが疑似組成式の記述子及び特徴量、出力データＹが教師ラベルＬを示す実数値となる。 The estimation model learning unit 106 generates an estimation model for estimating the synthesis possibility of an arbitrary composition formula by learning the pseudo composition formulas to which teacher labels have been assigned by the teacher label assigning unit 105. Here, the estimation model learning unit 106 generates the estimation model by solving a regression problem that estimates a real value from 0 to 1 indicating in which learning iteration a pseudo composition formula was obtained. That is, in the estimation model learning unit 106, the input data X is the descriptors and feature quantities of a pseudo composition formula, and the output data Y is the real value indicating the teacher label L.

以下、疑似組成式が何回目の学習段階で得られたものであるかを推定することが合成可能性を判定することに一致する理由について説明する。 The reason why estimating the number of learning stages in which the pseudo-composition formula was obtained coincides with judging the possibility of synthesis will be described below.

前述したＣ－ＧＡＮは、既知組成式リスト１０１を用いて学習される。既知組成式リスト１０１は、既に存在が確認された既知組成式のリストである。すなわち、合成可能性の範囲を０（合成可能性低）～１（合成可能性高）の値とした場合、既知組成式リスト１０１に含まれる既知組成式の合成可能性は１となる。一方、生成部８０２を構成するニューラルネットワークＮＮ１は、学習の初期段階では既知組成式と大きく異なる疑似組成式を生成する。しかし、徐々に学習を繰り返していくことにより、ニューラルネットワークＮＮ１は、既知組成式に似通った疑似組成式を生成する。 The aforementioned C-GAN is trained using the known composition list 101 . The known composition formula list 101 is a list of known composition formulas whose existence has already been confirmed. That is, when the range of synthesis possibility is set to a value of 0 (low synthesis possibility) to 1 (high synthesis possibility), the synthesis possibility of the known composition formula included in the known composition formula list 101 is 1. On the other hand, the neural network NN1 that constitutes the generation unit 802 generates a pseudo-composition formula that is significantly different from the known composition formula at the initial stage of learning. However, by gradually repeating learning, the neural network NN1 generates a pseudo compositional formula similar to the known compositional formula.

このことから、学習の初期段階に得られる疑似組成式は合成可能性が低いが、学習の終了間際に得られる疑似組成式は合成可能性が高いと言える。従って、初期の学習で得られた疑似組成式、すなわち、教師ラベルＬとして、Ｌ＝［０／Ｔ］＝［０］が付与された疑似組成式は、最も合成可能性が低いことを表すのである。一方、Ｔ回目の学習で得られた疑似組成式、すなわち、教師ラベルＬとして、Ｌ＝［Ｔ／Ｔ］＝［１］が付与された疑似組成式は最も合成可能性が高いことを表すのである。 From this, it can be said that the pseudo composition formulas obtained at the initial stage of learning have a low synthesis possibility, whereas the pseudo composition formulas obtained just before the end of learning have a high synthesis possibility. Therefore, a pseudo composition formula obtained in the initial learning, that is, one to which L=[0/T]=[0] is assigned as the teacher label L, represents the lowest synthesis possibility. On the other hand, a pseudo composition formula obtained in the T-th learning, that is, one to which L=[T/T]=[1] is assigned as the teacher label L, represents the highest synthesis possibility.

上述した教師ラベルＬの振り方は、合成可能性と学習段階を示す数値ｔとの関係が線形であると仮定している。但し、これは一例であり、本実施の形態においては、合成可能性と数値ｔとの関係は線形でなく、非線形であってもよい。例えば、ｔ回目の学習で得られた疑似組成式の合成可能性Ｚとすると、例えば、Ｚ＝ｓｉｇｍｏｉｄ（ｔ）により合成可能性は決定されてもよい。ここで、ｓｉｇｍｏｉｄ（ｔ）は、シグモイド関数を表す。この場合、教師ラベル付与部１０５は、合成可能性Ｚを疑似組成式に対する教師ラベルＬとして付与すればよい。 The assignment of the teacher label L described above assumes that the relationship between the synthesis possibility and the numerical value t indicating the learning stage is linear. However, this is just an example, and in the present embodiment the relationship between the synthesis possibility and the numerical value t may be nonlinear instead of linear. For example, letting Z be the synthesis possibility of a pseudo composition formula obtained in the t-th learning, the synthesis possibility may be determined by Z=sigmoid(t). Here, sigmoid(t) represents the sigmoid function. In this case, the teacher label assigning unit 105 may assign the synthesis possibility Z to the pseudo composition formula as the teacher label L.
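
A nonlinear label of this kind can be sketched as below. The description only gives Z=sigmoid(t); because a raw sigmoid of an integer stage t is already near 1 after a few stages, this sketch centers and scales t first. The centering at T/2, the `steepness` parameter, and the function name `sigmoid_label` are assumptions, not part of the description.

```python
import math

def sigmoid_label(t, total_steps, steepness=10.0):
    """Map learning stage t in 0..T to a synthesis possibility in (0, 1)."""
    z = steepness * (t / total_steps - 0.5)  # assumed centering and scaling
    return 1.0 / (1.0 + math.exp(-z))

T = 100
early = sigmoid_label(0, T)     # close to 0
middle = sigmoid_label(50, T)   # exactly 0.5
late = sigmoid_label(100, T)    # close to 1
```

Compared with the linear label t/T, this variant compresses the labels of very early and very late stages and spreads out the middle of training.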

本実の施形態では、推定モデル学習部１０６は、正例負例の二値判別ではなく、回帰問題として合成可能性を推定する。加えて、教師ラベルＬはニューラルネットワークＮＮ１の学習段階を示す数値ｔに基づいて付与される。生成問題を解くニューラルネットワークＮＮ１は、学習を繰り返すことによって、識別部８０３を構成するニューラルネットワークＮＮ２が本物の組成式と見分けがつかないような疑似組成式を生成していく。すなわち、生成部８０２が生成する疑似組成式は、合成可能性が段階的に高まっていく。 In the present embodiment, the estimation model learning unit 106 estimates the synthesis possibility as a regression problem instead of binary discrimination of positive and negative cases. In addition, the teacher label L is given based on the numerical value t indicating the learning stage of the neural network NN1. By repeating learning, the neural network NN1 that solves the generation problem generates pseudo-compositional formulas that make it indistinguishable from the real compositional formulas by the neural network NN2 that constitutes the identification unit 803 . That is, the pseudo-composition formula generated by the generation unit 802 has a stepwise increase in synthesizability.

さらに、本実施の形態においては、学習過程において、同一の疑似組成式が複数生成された場合、当該疑似組成式に対して最後に生成されたときの教師ラベルＬが付与される。例えば、学習過程の初期段階で合成可能性が高い疑似組成式が生成されたとする。この疑似組成式は、合成可能性が高いため、学習の中期～後期、終了間際においても高確率で生成部８０２により生成される。加えて、乱数の発生数を多くすればするほど、この疑似組成式の生成部８０２による発生確率は高まる。以上のことから、疑似組成式に対して最大の教師ラベルＬを付与することで、学習過程における合成可能性を示す教師ラベルの振り間違いによる学習効率の低下を抑制することができる。 Furthermore, in the present embodiment, when a plurality of identical pseudo-compositional formulas are generated in the learning process, the last-generated teacher label L is assigned to the pseudo-compositional formula. For example, assume that a pseudo-composition formula with a high possibility of synthesis is generated in the early stages of the learning process. Since this pseudo-composition formula has a high possibility of being synthesized, it is generated by the generation unit 802 with a high probability even in the middle to late stages of learning and just before the end of learning. In addition, as the number of random numbers generated increases, the probability of generation by the pseudo composition formula generator 802 increases. As described above, by assigning the maximum teacher label L to the pseudo-composition formula, it is possible to suppress a decrease in the learning efficiency due to misassignment of the teacher label indicating synthesis possibility in the learning process.

さらに、本実施の形態においては、推定モデル学習部１０６は、推定モデルの学習に生成部８０２にて生成された疑似組成式を利用している。生成部８０２は設定された乱数の発生数だけ疑似組成式を生成する。そのため、推定モデルの学習に必要なデータを大量に準備することができる。加えて、生成部８０２にて生成される疑似組成式は、既知組成式リスト１０１に含まれていない、すなわち、既知ＤＢに存在しない疑似組成式を多数含んでいる。これらにより、推定モデル学習部１０６は、推定モデルの過学習を抑制することができる。 Furthermore, in the present embodiment, the estimation model learning unit 106 uses the pseudo composition formulas generated by the generation unit 802 to train the estimation model. The generation unit 802 generates as many pseudo composition formulas as the configured number of random numbers. Therefore, a large amount of the data needed to train the estimation model can be prepared. In addition, the pseudo composition formulas generated by the generation unit 802 include many that are not in the known composition formula list 101, that is, that do not exist in the known DB. For these reasons, the estimation model learning unit 106 can suppress overfitting of the estimation model.

推定モデル保持部１０７は、推定モデル学習部１０６が生成した推定モデルを保持する。推定モデル保持部１０７により保持される情報としては、推定モデルを構成するニューラルネットワークの層の数及び各層に配置されたユニットの数を表す構成情報と、各ユニットでの荷重和計算に用いられる荷重値を表す荷重Ｗとが含まれる。 The estimation model holding unit 107 holds the estimation model generated by the estimation model learning unit 106. The information held by the estimation model holding unit 107 includes configuration information indicating the number of layers of the neural network constituting the estimation model and the number of units arranged in each layer, and the weights W representing the weight values used in the weighted-sum calculation in each unit.

活用過程では、入力部１０３には合成可能性の評価対象となる組成式がユーザにより入力される。記述子算出部１０８は、入力部１０３に入力された組成式を特徴量及び記述子に変換し、合成可能性算出部１０９に入力する。ここで、記述子算出部１０８は、組成式の結晶構造を表すパラメータを算出するための所定の数式に評価対象となる組成式を入力することで記述子を算出すればよい。また、記述子算出部１０８は、評価対象となる組成式について、図３で示す一次元のベクトル３０３を算出することで、当該組成式の特徴量を算出すればよい。 In the utilization process, the user inputs a composition formula to be evaluated for synthesis possibility into the input unit 103. The descriptor calculation unit 108 converts the composition formula input to the input unit 103 into a feature amount and a descriptor, and inputs them to the synthesis possibility calculation unit 109. Here, the descriptor calculation unit 108 may calculate the descriptor by inputting the composition formula to be evaluated into a predetermined formula for calculating parameters representing the crystal structure of the composition formula. The descriptor calculation unit 108 may also calculate the feature amount of the composition formula to be evaluated by calculating the one-dimensional vector 303 shown in FIG. 3.

合成可能性算出部１０９は、推定モデル保持部１０７に保持されている構成情報と荷重Ｗとで表される推定モデルを読み出す。そして、合成可能性算出部１０９は、推定モデルに対して、記述子算出部１０８により算出された特徴量及び記述子を入力データＸとして入力する。そして、合成可能性算出部１０９は、入力データＸが入力ユニットに与えられたときの推定モデルを構成する各ユニットでの荷重和を算出し、合成可能性を表す出力データＹを出力する。 The synthesis possibility calculation unit 109 reads out the estimation model, represented by the configuration information and the weights W, held in the estimation model holding unit 107. It then inputs the feature amount and the descriptor calculated by the descriptor calculation unit 108 to the estimation model as input data X. Finally, it calculates the weighted sum in each unit constituting the estimation model when the input data X is given to the input units, and outputs output data Y representing the synthesis possibility.
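
As a rough illustration of how configuration information and weights W can be turned into an output Y, the following sketch computes the weighted sum in each unit, layer by layer. The function name `forward`, the list-of-lists weight layout, and the sigmoid activation are assumptions made for this example; the description does not specify the activation function.

```python
import math


def forward(layer_sizes, weights, x):
    """Propagate input data X through a fully connected network.

    layer_sizes: configuration information, e.g. [2, 2, 1] (units per layer).
    weights: one matrix per layer transition; weights[k][j][i] connects unit i
             of layer k to unit j of layer k + 1.
    """
    if len(x) != layer_sizes[0]:
        raise ValueError("input length does not match the input layer")
    activation = list(x)
    for layer_index, matrix in enumerate(weights, start=1):
        if len(matrix) != layer_sizes[layer_index]:
            raise ValueError("weight matrix does not match the configuration")
        # Weighted sum in each unit of this layer.
        sums = [sum(w * a for w, a in zip(row, activation)) for row in matrix]
        # Assumed activation (hypothetical choice): logistic sigmoid, output in (0, 1).
        activation = [1.0 / (1.0 + math.exp(-s)) for s in sums]
    return activation


layer_sizes = [2, 2, 1]
zero_weights = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0]]]
y_zero = forward(layer_sizes, zero_weights, [0.3, 0.7])
some_weights = [[[0.5, -0.2], [0.1, 0.4]], [[1.0, -1.0]]]
y = forward(layer_sizes, some_weights, [0.3, 0.7])
```

With a sigmoid on the output unit, Y stays between 0 and 1, which is convenient for a value interpreted as a synthesis possibility.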

表示部１１０は、例えば、液晶表示ディスプレイ等の表示装置及びプロセッサで構成され、合成可能性算出部１０９によって算出された合成可能性を示す画像を生成して表示する。 The display unit 110 includes, for example, a display device such as a liquid crystal display and a processor, and generates and displays an image showing the synthesis possibility calculated by the synthesis possibility calculation unit 109.

図１０は、図１に示す材料情報出力システムの学習過程における処理の一例を示すフローチャートである。 FIG. 10 is a flowchart showing an example of processing in the learning process of the material information output system shown in FIG. 1.

Ｓ１０１では、生成モデル学習部１０２は、既知組成式リスト１０１に含まれる既知組成式を読み込む。Ｓ１０２では、生成モデル学習部１０２は、乱数発生部８０１で生成された乱数とＳ１０１で読み込んだ既知組成式の特性値とのペアを生成部８０２のニューラルネットワークＮＮ１に入力し、疑似組成式を得る。例えば、乱数の発生数が１０００であるとすると、１０００個のペアがニューラルネットワークＮＮ１に入力されて１０００個の疑似組成式が生成される。 In S101, the generative model learning unit 102 reads the known composition formulas included in the known composition formula list 101. In S102, the generative model learning unit 102 inputs pairs of a random number generated by the random number generation unit 801 and a characteristic value of a known composition formula read in S101 to the neural network NN1 of the generation unit 802 to obtain pseudo composition formulas. For example, if the number of random numbers generated is 1000, then 1000 pairs are input to the neural network NN1 and 1000 pseudo composition formulas are generated.

Ｓ１０３では、生成モデル学習部１０２の生成部８０２は、ニューラルネットワークＮＮ１によって生成された疑似組成式を学習段階を示す数値ｔと対応付けて疑似組成式保持部１０４に保持させる。例えば、上記の１０００個の例では、１回の学習につき１０００個の疑似組成式が生成されるので、１つの数値ｔに対して１０００個の疑似組成式が対応付けて保持される。 In S103, the generation unit 802 of the generative model learning unit 102 causes the pseudo composition formula holding unit 104 to hold the pseudo composition formulas generated by the neural network NN1 in association with the numerical value t indicating the learning stage. For example, in the above example, 1000 pseudo composition formulas are generated per learning iteration, so 1000 pseudo composition formulas are held in association with each numerical value t.

Ｓ１０４では、生成モデル学習部１０２の識別部８０３は、疑似組成式と既知組成式とをニューラルネットワークＮＮ２に入力する。例えば、識別部８０３は、ｔ回目の学習において生成部８０２で生成された１０００個の疑似組成式と、既知組成式リスト１０１に含まれる既知組成式とを交互にニューラルネットワークＮＮ２に入力すればよい。 In S104, the identification unit 803 of the generative model learning unit 102 inputs the pseudo composition formulas and the known composition formulas to the neural network NN2. For example, the identification unit 803 may alternately input the 1000 pseudo composition formulas generated by the generation unit 802 in the t-th learning iteration and the known composition formulas included in the known composition formula list 101 to the neural network NN2.

その後、Ｓ１０５からＳ１０７の処理が並列で実施される。 After that, the processes from S105 to S107 are performed in parallel.

Ｓ１０５では、識別部８０３は、ニューラルネットワークＮＮ２に入力された既知組成式の特性値を推定し、実際の特性値との誤差を計算する。実際の特性値とは、特性値が推定された既知組成式に対応する既知組成式リスト１０１が保持する特性値のことを指す。 In S105, the identification unit 803 estimates the characteristic value of the known composition formula input to the neural network NN2, and calculates the error from the actual characteristic value. The actual characteristic value refers to the characteristic value held in the known composition formula list 101 corresponding to the known composition formula for which the characteristic value is estimated.

Ｓ１０６では、識別部８０３は、ニューラルネットワークＮＮ２に入力された既知組成式と疑似組成式とのそれぞれについて二値判別を実施し、正答率を算出する。 In S106, the identification unit 803 performs binary discrimination on each of the known composition formula and the pseudo composition formula input to the neural network NN2, and calculates the percentage of correct answers.

Ｓ１０７では、識別部８０３は、ニューラルネットワークＮＮ２に入力された疑似組成式の特性値を推定し、推定した特性値と、Ｓ１０２において乱数とペアで入力された特性値との誤差を算出する。ここで、乱数とペアで入力された特性値とは、特性値が推定された疑似組成式を生成する際に生成部８０２に入力された既知組成式の特性値のことを指す。 In S107, the identification unit 803 estimates the characteristic value of the pseudo-composition formula input to the neural network NN2, and calculates the error between the estimated characteristic value and the characteristic value input in S102 as a pair with the random number. Here, the characteristic value input in a pair with the random number refers to the characteristic value of the known composition formula input to the generator 802 when generating the pseudo composition formula in which the characteristic value is estimated.

Ｓ１０８では、識別部８０３は、Ｓ１０５で算出された誤差が小さくなり、且つ、Ｓ１０６で算出された正答率が高くなるように、ニューラルネットワークＮＮ２の荷重Ｗを更新する。 In S108, the identification unit 803 updates the weight W of the neural network NN2 so that the error calculated in S105 becomes smaller and the correct answer rate calculated in S106 increases.

Ｓ１０９では、生成部８０２は、Ｓ１０６で算出された正答率が低くなり、且つ、Ｓ１０７で算出された誤差が小さくなるように、ニューラルネットワークＮＮ１の荷重Ｗを更新する。ここで、識別部８０３は、疑似組成式の二値判別結果は生成部８０２に出力するが、既知組成式の二値判別結果は生成部８０２に出力しない。したがって、生成部８０２には、識別部８０３の疑似組成式に対する二値判別結果を示す出力データＹ［本物、偽物］が入力される。 In S109, the generation unit 802 updates the weights W of the neural network NN1 so that the correct answer rate calculated in S106 decreases and the error calculated in S107 decreases. Here, the identification unit 803 outputs the binary discrimination results for the pseudo composition formulas to the generation unit 802, but does not output the binary discrimination results for the known composition formulas. Therefore, the generation unit 802 receives output data Y [genuine, fake] indicating the binary discrimination result of the identification unit 803 for each pseudo composition formula.

生成部８０２が生成した疑似組成式に対して、識別部８０３が本物と判別した場合、すなわち、識別部８０３が誤判別をした場合、生成部８０２は、より本物に近い疑似組成式を生成できていると言える。したがって、疑似組成式に対する出力データＹ［本物、偽物］と正解ラベル［１，０］との距離Ｄが小さいほど、生成部８０２は、より本物に近い疑似組成式を生成できていると言える。そこで、生成部８０２は、距離Ｄが小さくなるようにニューラルネットワークＮＮ１の荷重Ｗを調整することで、Ｓ１０６で算出された正答率を低くしている。 When the identification unit 803 judges a pseudo composition formula generated by the generation unit 802 to be genuine, that is, when the identification unit 803 misjudges it, the generation unit 802 can be said to have generated a pseudo composition formula that is closer to a genuine one. Therefore, the smaller the distance D between the output data Y [genuine, fake] for the pseudo composition formula and the correct label [1, 0], the closer to genuine the pseudo composition formulas generated by the generation unit 802 can be said to be. The generation unit 802 therefore adjusts the weights W of the neural network NN1 so that the distance D becomes smaller, thereby lowering the correct answer rate calculated in S106.
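
The distance D described above can be illustrated with a small helper. The description does not say which distance is used, so the Euclidean distance here is an assumption, as are the function name `distance_to_genuine` and the sample outputs.

```python
import math


def distance_to_genuine(y, target=(1.0, 0.0)):
    """Distance D between discriminator output Y [genuine, fake] and the label [1, 0].

    Assumption (hypothetical choice): Euclidean distance.
    """
    return math.dist(y, target)


# The discriminator is fooled: it rates the pseudo formula as mostly genuine.
d_fooled = distance_to_genuine((0.9, 0.1))
# The discriminator is not fooled: it rates the pseudo formula as mostly fake.
d_caught = distance_to_genuine((0.2, 0.8))
```

A generator update that reduces D pushes the discriminator output toward [1, 0], which corresponds to the discriminator misjudging more pseudo composition formulas as genuine.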

Ｓ１１０では、生成モデル学習部１０２は、ニューラルネットワークＮＮ１，ＮＮ２の学習回数が総学習回数Ｔに到達すると、処理をＳ１１１に遷移させ、総学習回数Ｔに到達していなければ、処理をＳ１０２に遷移させる。 In S110, when the number of learning iterations of the neural networks NN1 and NN2 reaches the total number of learning iterations T, the generative model learning unit 102 advances the process to S111; otherwise, it returns the process to S102.
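
The control flow of S102 to S110 can be summarized as a loop. The sketch below only mirrors the bookkeeping: the callables `generate` and `update` are hypothetical stand-ins for the neural networks NN1 and NN2, the characteristic values are arbitrary placeholders, and numbering the learning stages from 0 is an assumption.

```python
import random


def run_learning_loop(known, total_steps, n_random, generate, update, seed=0):
    """Return the list of (t, pseudo formula) records gathered during learning.

    known: list of (formula, characteristic value) pairs (known composition list).
    generate(z, value): hypothetical stand-in for the generator network NN1.
    update(pseudo, known): hypothetical stand-in for S104 to S109.
    """
    rng = random.Random(seed)
    store = []  # plays the role of the pseudo composition formula holding unit
    for t in range(total_steps):  # S110: repeat until T iterations are done
        pseudo = []
        for _ in range(n_random):  # S102: one pseudo formula per random number
            z = rng.random()
            _, value = rng.choice(known)
            pseudo.append(generate(z, value))
        store.extend((t, formula) for formula in pseudo)  # S103: keep t with each
        update(pseudo, known)  # S104 to S109: train discriminator and generator
    return store


known = [("Al2O3", 1.0), ("MgO", 2.0)]  # placeholder characteristic values
store = run_learning_loop(
    known, total_steps=3, n_random=4,
    generate=lambda z, value: "AlO2",      # dummy generator
    update=lambda pseudo, known: None,     # dummy training step
)
```

Each learning stage contributes exactly `n_random` records, matching the example of 1000 pseudo composition formulas per numerical value t.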

Ｓ１１１では、教師ラベル付与部１０５は、疑似組成式保持部１０４に保持されている疑似組成式に対して、教師ラベルＬを付与する。 In S111 , the teacher label assigning unit 105 assigns a teacher label L to the pseudo composition formula held in the pseudo composition formula holding unit 104 .

Ｓ１１２では、推定モデル学習部１０６は、疑似組成式から教師ラベルＬの値を推定する回帰問題を解くニューラルネットワークを学習する。Ｓ１１３では、推定モデル学習部１０６は、Ｓ１１２で学習されたニューラルネットワークを推定モデルとして推定モデル保持部１０７に保持させる。以上により学習過程が終了される。 In S112, the estimation model learning unit 106 learns a neural network that solves the regression problem of estimating the value of the teacher label L from the pseudo-composition formula. In S113, the estimation model learning unit 106 causes the estimation model holding unit 107 to hold the neural network learned in S112 as an estimation model. The learning process is completed by the above.

図１１は、図１に示す材料情報出力システムの活用過程における処理の一例を示すフローチャートである。Ｓ２０１では、入力部１０３は、合成可能性の評価対象となる組成式が入力される。Ｓ２０２では、合成可能性算出部１０９は、推定モデルを推定モデル保持部１０７から読み込む。Ｓ２０３では、記述子算出部１０８は、Ｓ２０１にて入力された組成式の特徴量及び記述子を算出する。 FIG. 11 is a flowchart showing an example of processing in the utilization process of the material information output system shown in FIG. 1. In S201, the input unit 103 receives a composition formula to be evaluated for synthesis possibility. In S202, the synthesis possibility calculation unit 109 reads the estimation model from the estimation model holding unit 107. In S203, the descriptor calculation unit 108 calculates the feature amount and the descriptor of the composition formula input in S201.

Ｓ２０４では、合成可能性算出部１０９は、Ｓ２０３で算出された特徴量及び記述子を推定モデルに入力し、合成可能性を算出する。Ｓ２０５では、表示部１１０は、算出された合成可能性を示す画像を表示する。以上により活用過程が終了される。 In S204, the synthesis possibility calculation unit 109 inputs the feature amount and the descriptor calculated in S203 to the estimation model and calculates the synthesis possibility. In S205, the display unit 110 displays an image indicating the calculated synthesis possibility. This completes the utilization process.

図１２は、合成可能性をユーザに提示するための画像Ｇ１の一例を示す図である。画像Ｇ１には、ユーザが入力した組成式に対する合成可能性の数値が表示されている。これにより、ユーザは、評価対象となる組成式の合成可能性を容易に認識できる。 FIG. 12 is a diagram showing an example of an image G1 for presenting synthesis possibilities to the user. The image G1 displays the numerical value of the synthesis possibility for the composition formula input by the user. This allows the user to easily recognize the possibility of synthesizing the composition formula to be evaluated.

以上説明したように、実施の形態１に係る材料情報出力システムによれば、疑似組成式に対して適切な合成可能性を示す教師ラベルＬを付与することができるので、教師ラベルＬの振り間違いを抑制することができる。そして、適切な教師ラベルＬが付与された疑似組成式を用いて学習された推定モデルを用いて合成可能性が算出されているため、合成可能性の適切な値を材料研究者に提示することができる。その結果、実際には合成可能な未知の組成式が合成不可能と判断されることを防止できる。 As described above, the material information output system according to Embodiment 1 can assign to each pseudo composition formula a teacher label L indicating an appropriate synthesis possibility, and can therefore suppress mislabeling. Because the synthesis possibility is calculated using an estimation model trained on pseudo composition formulas with appropriate teacher labels L, an appropriate value of the synthesis possibility can be presented to materials researchers. As a result, it is possible to prevent an unknown composition formula that is actually synthesizable from being determined to be unsynthesizable.

また、本実施の形態における材料情報出力装置は、推定モデルの構築に既知組成式リスト１０１を明示的に利用していない。すなわち、教師ラベルＬの付与されていない、未知の組成式を含む疑似組成式が入力として与えられ、合成可能性を推定する推定モデルが学習される。従って、推定モデルを学習する際には、既知組成式リスト１０１を保持しておく必要はない。これにより、既知組成式リスト１０１を保持する企業は、企業内の実験データ等の秘匿情報として既知組成式リスト１０１を社内に残しつつ、推定モデルの学習を行う材料情報出力装置１００は外部のサービス提供企業のコンピュータで構築できるようになる。すなわち、既知組成式リスト１０１を保持する企業は、既知組成式リスト１０１を提供することなくサービス提供企業に対して材料情報出力装置１００を用いた材料情報に関するサービスを実施させることができる。 Further, the material information output device according to the present embodiment does not explicitly use the known composition formula list 101 to construct the estimation model. That is, pseudo composition formulas, which include unknown composition formulas and have no teacher label L assigned, are given as input, and an estimation model for estimating the synthesis possibility is trained. Therefore, the known composition formula list 101 does not need to be held when the estimation model is trained. Consequently, a company holding the known composition formula list 101 can keep the list in-house as confidential information, such as internal experimental data, while the material information output device 100 that trains the estimation model is built on a computer of an external service provider. That is, the company holding the known composition formula list 101 can have the service provider offer a material information service using the material information output device 100 without handing over the list.

この態様を実現するには、図１において、材料情報出力装置１００に含まれていない残りのブロックを材料情報出力装置１００とは別の第１コンピュータで構成すればよい。そして、材料情報出力装置１００とこの第１コンピュータとをインターネット等のネットワークを介して通信可能に接続すればよい。或いは、図１において入力部１０３及び表示部１１０を更に別の第２コンピュータで構成してもよい。この場合、第２コンピュータはサービス提供企業からサービスが提供されるユーザが保持するコンピュータである。これにより、サービス提供企業は、既知組成式リスト１０１を保持する企業以外のユーザに対してサービスを提供することができる。 To realize this aspect, the remaining blocks in FIG. 1 that are not included in the material information output device 100 may be implemented on a first computer separate from the material information output device 100. The material information output device 100 and this first computer may then be communicably connected via a network such as the Internet. Alternatively, the input unit 103 and the display unit 110 in FIG. 1 may be implemented on yet another, second computer. In this case, the second computer is a computer held by a user who receives the service from the service provider. This allows the service provider to offer the service to users other than the company holding the known composition formula list 101.

（実施の形態１の変形例１） (Modification 1 of Embodiment 1)
図１３は、実施の形態１の変形例に係る材料情報出力システムの構成の一例を示すブロック図である。この変形例では、教師ラベル付与部１０５は、既知組成式リスト１０１に含まれる既知組成式を読み込む。教師ラベル付与部１０５は、既知組成式リスト１０１に含まれる既知組成式に対して合成可能性の最大値である「１」を教師ラベルＬとして付与する。一方、教師ラベル付与部１０５は、疑似組成式に対しては、合成可能性の上限値を「１－ε」に設定して教師ラベルＬを付与する。 FIG. 13 is a block diagram showing an example of the configuration of a material information output system according to a modification of Embodiment 1. In this modification, the teacher label assigning unit 105 reads the known composition formulas included in the known composition formula list 101. The teacher label assigning unit 105 assigns "1", the maximum value of the synthesis possibility, as the teacher label L to the known composition formulas included in the known composition formula list 101. For the pseudo composition formulas, on the other hand, the teacher label assigning unit 105 sets the upper limit of the synthesis possibility to "1-ε" and assigns the teacher label L.

これにより、推定モデル学習部１０６は、教師ラベルＬが付与された疑似組成式のみならず、教師ラベルＬが付与された既知組成式も学習データとして利用して推定モデルの学習を行うことができる。なお、εの値としては、例えばε＝０．００１などの値が設定されるが、これは一例であり、適宜変更可能である。具体的には、教師ラベル付与部１０５は、疑似組成式に対して算出した教師ラベルＬに１－εを乗じることで教師ラベルＬを補正すればよい。 As a result, the estimation model learning unit 106 can learn the estimation model using not only the pseudo-composition formula to which the teacher label L is assigned but also the known composition formula to which the teacher label L is assigned as learning data. . As the value of ε, for example, a value such as ε=0.001 is set, but this is an example and can be changed as appropriate. Specifically, the teacher label assigning unit 105 may correct the teacher label L by multiplying the teacher label L calculated for the pseudo composition formula by 1-ε.
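
The label correction in this modification amounts to a simple rescaling. In the sketch below, the function name `build_training_labels` and the sample formulas are hypothetical, and letting a known composition formula override a pseudo composition formula with the same string is an assumption not stated in the description.

```python
EPSILON = 0.001  # example value from the description; adjustable


def build_training_labels(pseudo_labels, known_formulas, epsilon=EPSILON):
    """Combine corrected pseudo labels with labels of 1 for known formulas."""
    # Pseudo composition formulas: scale so the upper limit becomes 1 - epsilon.
    labels = {f: label * (1.0 - epsilon) for f, label in pseudo_labels.items()}
    # Known composition formulas: the maximum synthesis possibility, 1.
    # Assumption: a known formula takes precedence over an identical pseudo formula.
    for formula in known_formulas:
        labels[formula] = 1.0
    return labels


pseudo_labels = {"AlO2": 1.0, "LiCl3": 0.5, "MgO": 0.8}
training_labels = build_training_labels(pseudo_labels, ["Al2O3", "MgO"])
```

After the correction, only known composition formulas can carry the label 1, so the estimation model can tell confirmed materials apart from even the most plausible pseudo composition formulas.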

これにより、確実に合成可能である既知組成式を考慮に入れて、推定モデルを学習させることができ、推定モデルの推定精度の向上を図ることができる。 As a result, the estimation model can be learned by taking into account the known composition formula that can be reliably synthesized, and the estimation accuracy of the estimation model can be improved.

（実施の形態１の変形例２） (Modification 2 of Embodiment 1)
教師ラベル付与部１０５は、学習過程の初期に生成された疑似組成式、すなわち、学習段階を示す数値ｔが０の疑似組成式と、学習終了時に生成された疑似組成式、すなわち、学習段階を示す数値ｔがＴの疑似組成式とにのみ教師ラベルＬを付与し、これらの疑似組成式のみを疑似組成式保持部１０４に保持させてもよい。これにより、疑似組成式保持部１０４のメモリの空き容量を確保できる。 The teacher label assigning unit 105 may assign the teacher label L only to the pseudo composition formulas generated at the beginning of the learning process, that is, those whose numerical value t indicating the learning stage is 0, and to the pseudo composition formulas generated at the end of learning, that is, those whose numerical value t is T, and may cause the pseudo composition formula holding unit 104 to hold only these pseudo composition formulas. This secures free memory capacity in the pseudo composition formula holding unit 104.
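
A minimal sketch of this memory-saving variant follows; it keeps only the records from the first and the last learning stage. The function name `keep_endpoints`, the record layout `(t, formula)`, and the sample data are assumptions made for illustration.

```python
def keep_endpoints(store, total_steps, first_stage=0):
    """Keep only records whose learning stage t is the first stage or T."""
    wanted = {first_stage, total_steps}
    return [(t, formula) for t, formula in store if t in wanted]


# Hypothetical records covering stages 0 to 3 (T = 3).
store = [(0, "Li3O7"), (1, "LiO4"), (2, "LiCoO3"), (3, "LiCoO2"), (3, "Al2O3")]
kept = keep_endpoints(store, total_steps=3)
```

Only two stages are retained regardless of T, so the amount of stored data no longer grows with the number of learning iterations.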

（実施の形態２） (Embodiment 2)
実施の形態２に係る材料情報出力システムは、合成可能性を推定したい材料の種類を考慮に入れて、推定モデルを生成するものである。図１４は、本開示の実施の形態２に係る材料情報出力システムの構成の一例を示すブロック図である。なお、実施の形態２において、実施の形態１と同一の構成要素には同一の符号を付し、詳細な説明は省略する。 The material information output system according to Embodiment 2 generates an estimation model that takes into account the type of material whose synthesis possibility is to be estimated. FIG. 14 is a block diagram showing an example of the configuration of a material information output system according to Embodiment 2 of the present disclosure. In Embodiment 2, components identical to those of Embodiment 1 are given the same reference numerals, and their detailed description is omitted.

図１４において図１との相違点は、組み合わせ算出部１３０１と組み合わせ保持部１３０２とが新たに追加されている点にある。組み合わせ算出部１３０１は、入力部１０３で入力された評価対象となる材料の種類情報が示す種類と同じ種類の既知組成式を第１既知組成式として既知組成式リスト１０１から抽出する。そして、組み合わせ算出部１３０１は、後述する投票処理を実行し、得られた投票結果を組み合わせ保持部１３０２に記憶させる。 FIG. 14 differs from FIG. 1 in that a combination calculation unit 1301 and a combination holding unit 1302 are newly added. The combination calculation unit 1301 extracts from the known composition formula list 101, as first known composition formulas, the known composition formulas of the same type as the type indicated by the type information of the material to be evaluated that was input through the input unit 103. The combination calculation unit 1301 then executes the voting process described later and stores the resulting voting results in the combination holding unit 1302.

組み合わせ保持部１３０２は、例えば、半導体メモリで構成され、組み合わせ算出部１３０１が生成した投票結果を保持する。 The combination holding unit 1302 is configured by, for example, a semiconductor memory, and holds the voting results generated by the combination calculation unit 1301 .

教師ラベル付与部１０５は、疑似組成式保持部１０４が保持する疑似組成式と、組み合わせ保持部１３０２が保持する投票結果とを照らし合わせ、疑似組成式保持部１０４が保持する疑似組成式の中から第１疑似組成式を抽出する。そして、教師ラベル付与部１０５は、第１疑似組成式のそれぞれに対して実施の形態１と同一の手法を用いて教師ラベルを付与する。 The teacher label assigning unit 105 compares the pseudo composition formula held by the pseudo composition formula holding unit 104 with the voting results held by the combination holding unit 1302, and selects a pseudo composition formula from among the pseudo composition formulas held by the pseudo composition formula holding unit 104. A first pseudo-composition formula is extracted. Then, the teacher label assigning unit 105 assigns a teacher label to each of the first pseudo compositional formulas using the same method as in the first embodiment.

図１５は、図１４に示す材料情報出力システムの学習過程における処理の一例を示すフローチャートである。なお、図１５のフローにおいて図１０と同一の処理には同一のステップ番号を付している。 FIG. 15 is a flowchart showing an example of processing in the learning process of the material information output system shown in FIG. 14. In the flow of FIG. 15, the same step numbers are assigned to the same processes as in FIG. 10.

Ｓ３０１では、組み合わせ算出部１３０１は、既知組成式リスト１０１に含まれる既知組成式と、ユーザが入力した評価対象の材料の種類を示す種類情報とを取得する。ここで、種類情報とは、例えば、リチウムイオン電池及びマグネシウム二次電池等の材料を分類する情報である。なお、種類情報は、固体電池、太陽電池、及び熱電材料等の情報が採用されてもよい。 In S301, the combination calculation unit 1301 acquires the known composition formula included in the known composition formula list 101 and the type information indicating the type of material to be evaluated input by the user. Here, the type information is, for example, information for classifying materials such as lithium ion batteries and magnesium secondary batteries. Information such as a solid battery, a solar battery, and a thermoelectric material may be used as the type information.

以降、実施の形態１で説明したＳ１０２からＳ１１０までの処理とＳ３０２との処理とが並列に実行される。Ｓ３０２では、組み合わせ算出部１３０１は、投票処理を実行する。図１６は、投票処理の詳細を示すフローチャートである。 Thereafter, the processing from S102 to S110 described in Embodiment 1 and the processing of S302 are executed in parallel. In S302, the combination calculation unit 1301 executes the voting process. FIG. 16 is a flowchart showing the details of the voting process.

Ｓ１６０１では、組み合わせ算出部１３０１は、既知組成式リスト１０１からＳ３０１で取得された種類情報が示す種類と同一種類の既知組成式を第１既知組成式として抽出する。 In S1601, the combination calculation unit 1301 extracts from the known composition formula list 101 the known composition formula of the same type as the type indicated by the type information acquired in S301 as the first known composition formula.

図１８は、実施の形態２に係る既知組成式リスト１０１のデータ構成の一例を示す図である。図１８においては、図２に対して更に「種類」が設けられている。「種類」は、リチウムイオン電池、マグネシウム電池といった各組成式が示す材料の種類情報である。例えば、入力された種類情報がリチウムイオン電池であれば、「種類」がリチウムイオン電池である既知組成式が第１既知組成式として抽出される。 FIG. 18 is a diagram showing an example of the data structure of the known composition formula list 101 according to Embodiment 2. Compared with FIG. 2, FIG. 18 additionally has a "type" field. "Type" is information on the type of material indicated by each composition formula, such as lithium ion battery or magnesium battery. For example, if the input type information is lithium ion battery, the known composition formulas whose "type" is lithium ion battery are extracted as the first known composition formulas.

図１６に参照を戻す。Ｓ１６０２では、組み合わせ算出部１３０１は、第１既知組成式のそれぞれについて元素の族の組み合わせを特定する。 Returning to FIG. 16, in S1602, the combination calculation unit 1301 identifies the combination of element groups for each of the first known composition formulas.

元素の族とは、元素の周期表の縦１列に相当するクラスタのことを指すが、本実施の形態では、元素の族は、図１７に示すように周期表を７つのクラスタに分類することで構成される。図１７は、元素の族を説明する図である。図１７に示すように、本実施の形態では、元素の族は、アルカリ金属１４０１と、アルカリ土類金属１４０２と、ハロゲン１４０３と、希ガス１４０４と、遷移元素１４０５と、その他の金属元素１４０６と、その他の非金属元素１４０７とで構成される。 A group of elements normally refers to the cluster corresponding to one vertical column of the periodic table, but in the present embodiment the element groups are formed by classifying the periodic table into seven clusters, as shown in FIG. 17. FIG. 17 is a diagram illustrating the element groups. As shown in FIG. 17, in the present embodiment the element groups consist of alkali metals 1401, alkaline earth metals 1402, halogens 1403, rare gases 1404, transition elements 1405, other metal elements 1406, and other nonmetallic elements 1407.

元素の族の組み合わせとは、二元系であれば、図１７に示す７つの元素の族のうちの任意の２つの元素の族の組み合わせが該当し、三元系であれば、図１７に示す７つの元素の族のうちの任意の３つの元素の族の組み合わせが該当する。したがって、二元系であれば、元素の族の組み合わせは₇Ｃ₂＝２１通り存在し、三元系であれば、元素の族の組み合わせは₇Ｃ₃＝３５通り存在する。 For a binary system, a combination of element groups is any combination of two of the seven element groups shown in FIG. 17; for a ternary system, it is any combination of three of the seven element groups shown in FIG. 17. Therefore, there are ₇C₂ = 21 combinations of element groups for a binary system and ₇C₃ = 35 combinations for a ternary system.
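
The number of element-group combinations can be checked by enumeration. The sketch below lists every unordered selection of two and of three distinct groups out of seven; the English group names in `GROUPS` are shorthand chosen for this example.

```python
from itertools import combinations
from math import comb

# The seven element groups (reference numerals 1401 to 1407).
GROUPS = [
    "alkali metal",
    "alkaline earth metal",
    "halogen",
    "rare gas",
    "transition element",
    "other metal",
    "other nonmetal",
]

binary_combinations = list(combinations(GROUPS, 2))
ternary_combinations = list(combinations(GROUPS, 3))

n_binary = len(binary_combinations)    # equals comb(7, 2)
n_ternary = len(ternary_combinations)  # equals comb(7, 3)
```

Counting unordered selections of distinct groups gives 21 pairs and 35 triples. Compounds whose elements all fall in the same group are not covered by this count.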

例えば、Ａｌ₂Ｏ₃とＭｇＯとは共に、元素の族の組み合わせが「その他の金属元素１４０６」と「その他の非金属元素１４０７」とであるため、元素の族の組み合わせは同一となる。一方、ＭｇＣｌ₂は、元素の族の組み合わせが「その他の金属元素１４０６」と「ハロゲン１４０３」とであるため、Ａｌ₂Ｏ₃とは元素の族の組み合わせが異なるものとなる。 For example, Al₂O₃ and MgO both have the element-group combination "other metal elements 1406" and "other nonmetallic elements 1407", so their element-group combinations are the same. MgCl₂, on the other hand, has the element-group combination "other metal elements 1406" and "halogens 1403", which differs from that of Al₂O₃.

図１６に参照を戻す。Ｓ１６０３では、組み合わせ算出部１３０１は、周期表から得られる予め定められた元素の族の組み合わせに対して、第１既知組成式の元素の族の組み合わせを投票する。 Returning to FIG. 16, in S1603, the combination calculation unit 1301 casts a vote for the element-group combination of each first known composition formula among the predetermined element-group combinations obtained from the periodic table.

例えば、第１既知組成式にＡｌ₂Ｏ₃が含まれているとすると、「その他の金属元素１４０６」と「その他の非金属元素１４０７」との元素の組み合わせに対して１票が投じられる。第１既知組成式にＭｇＣｌ₂が含まれているとすると、「その他の金属元素１４０６」と「ハロゲン１４０３」との元素の組み合わせに対して１票が投じられる。なお、周期表から得られる予め定められた元素の族の組み合わせとは、二元系であれば上述の２１通りの元素の族の組み合わせが該当し、三元系であれば上述の３５通りの元素の族の組み合わせが該当する。したがって、投票結果は、例えば、「その他の金属元素１４０６」と「ハロゲン１４０３」との元素の族の組み合わせについてａ票、「その他の金属元素１４０６」と「その他の非金属元素１４０７」との元素の族の組み合わせについてｂ票というように、予め定められた元素の族の組み合わせと、それぞれの元素の族の組み合わせに対する投票数とが対応付けられたデータ構成を持つことになる。 For example, if Al₂O₃ is among the first known composition formulas, one vote is cast for the combination of element groups "other metal elements 1406" and "other nonmetallic elements 1407". If MgCl₂ is among the first known composition formulas, one vote is cast for the combination "other metal elements 1406" and "halogens 1403". The predetermined element-group combinations obtained from the periodic table are the 21 combinations described above for a binary system and the 35 combinations described above for a ternary system. The voting result therefore has a data structure in which each predetermined element-group combination is associated with its number of votes, for example, a votes for the combination of "other metal elements 1406" and "halogens 1403" and b votes for the combination of "other metal elements 1406" and "other nonmetallic elements 1407".

Ｓ１６０４では、組み合わせ算出部１３０１は、投票結果を組み合わせ保持部１３０２に保持させる。以上により投票処理が終了される。 In S1604, the combination calculation unit 1301 causes the combination holding unit 1302 to hold the voting results. Voting processing is completed as described above.
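
The voting process of S1602 to S1604 can be sketched with a counter keyed by element-group combination. The table `ELEMENT_GROUP` below is a tiny hypothetical excerpt that follows the classification used in the examples of this description (Al and Mg as other metal elements); the regular-expression parser and the function names are likewise assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical excerpt of the element-to-group table (a full table would follow FIG. 17).
ELEMENT_GROUP = {
    "Li": "alkali metal",
    "Al": "other metal",
    "Mg": "other metal",
    "O": "other nonmetal",
    "Cl": "halogen",
}


def group_combination(formula):
    """S1602: map a composition formula to its set of element groups."""
    elements = re.findall(r"[A-Z][a-z]?", formula)
    return frozenset(ELEMENT_GROUP[element] for element in elements)


def vote(first_known_formulas):
    """S1603: cast one vote per formula for its element-group combination."""
    return Counter(group_combination(f) for f in first_known_formulas)


votes = vote(["Al2O3", "MgO", "MgCl2"])
metal_nonmetal = frozenset({"other metal", "other nonmetal"})
metal_halogen = frozenset({"other metal", "halogen"})
```

Using `frozenset` keys makes the combination independent of the order in which the elements appear in the formula. A `Counter` returns 0 for combinations that received no vote, which is what the later extraction step needs.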

図１５に参照を戻す。Ｓ３０３では、教師ラベル付与部１０５は、疑似組成式保持部１０４に保持されている疑似組成式のうち、元素の族の組み合わせが、組み合わせ保持部１３０２に保持されている投票結果が０でない元素の族の組み合わせを持つ疑似組成式を第１疑似組成式として抽出する。そして、教師ラベル付与部１０５は、第１疑似組成式に対して実施の形態１と同じ手法を用いて教師ラベルＬを付与する。 Returning to FIG. 15, in S303, the teacher label assigning unit 105 extracts, as first pseudo composition formulas, those pseudo composition formulas held in the pseudo composition formula holding unit 104 whose combination of element groups has a nonzero vote count in the voting results held in the combination holding unit 1302. The teacher label assigning unit 105 then assigns the teacher label L to the first pseudo composition formulas using the same method as in Embodiment 1.
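
The extraction in S303 is a filter on the vote count. In this sketch, the element-group combination of each pseudo composition formula is assumed to be already known and is passed in as a dictionary; the function name `extract_first_pseudo` and the sample formulas are hypothetical.

```python
def extract_first_pseudo(pseudo_combinations, votes):
    """Keep pseudo formulas whose element-group combination received any vote.

    pseudo_combinations: {formula: frozenset of element groups}
    votes: {frozenset of element groups: number of votes}
    """
    return [
        formula
        for formula, combination in pseudo_combinations.items()
        if votes.get(combination, 0) > 0
    ]


votes = {
    frozenset({"other metal", "other nonmetal"}): 2,
    frozenset({"other metal", "halogen"}): 1,
}
pseudo_combinations = {
    "AlO2": frozenset({"other metal", "other nonmetal"}),
    "LiCl3": frozenset({"alkali metal", "halogen"}),
    "MgCl": frozenset({"other metal", "halogen"}),
}
first_pseudo = extract_first_pseudo(pseudo_combinations, votes)
```

Only the extracted formulas go on to receive a teacher label, so the estimation model is trained on pseudo composition formulas that resemble the material type being evaluated.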

以後、Ｓ１１２、Ｓ１１３の処理が実施され、Ｓ３０３で教師ラベルＬが付与された疑似組成式を用いて推定モデルが生成される。 After that, the processes of S112 and S113 are performed, and an estimated model is generated using the pseudo-composition formula to which the teacher label L is assigned in S303.

元素の族が異なると、元素の持つ特徴が大きく変化する。加えて、機械学習において、評価対象の材料に関連するデータのデータ数の比率が他の材料と比べて少ない場合、評価対象の材料の合成可能性の推定精度が低下する。そこで、本実施の形態では、評価対象の材料と種類が同じである第１既知組成式が抽出され、第１既知組成式と元素の族の組み合わせが同じである第１疑似組成式を用いて推定モデルが生成されている。そのため、本実施の形態では、評価対象の材料（組成式）の合成可能性の推定精度を向上させることができる。 When the element groups differ, the characteristics of the elements change greatly. In addition, in machine learning, when the proportion of data related to the material to be evaluated is small compared with other materials, the accuracy of estimating the synthesis possibility of that material decreases. In the present embodiment, therefore, the first known composition formulas, which are of the same type as the material to be evaluated, are extracted, and the estimation model is generated using the first pseudo composition formulas, which have the same combinations of element groups as the first known composition formulas. As a result, the present embodiment can improve the accuracy of estimating the synthesis possibility of the material (composition formula) to be evaluated.

一方で、生成モデル学習部１０２における生成部８０２の学習においては、様々な種類のデータを利用した方が、これまでの経験では思いもつかない疑似組成式を生成する可能性が高まるため、データを限定すべきではない。そこで、本実施の形態では、実施の形態１と同様、生成モデル学習部１０２は既知組成式リスト１０１に含まれる全ての既知組成式を利用して生成モデルを生成する。 On the other hand, in the learning of the generating unit 802 in the generative model learning unit 102, the use of various types of data increases the possibility of generating pseudo-composition formulas that have not been conceived through previous experience. should not be limited. Therefore, in the present embodiment, as in the first embodiment, the generative model learning unit 102 uses all the known compositional formulas included in the known compositional formula list 101 to generate a generative model.

以上説明したように、実施の形態２に係る材料情報出力システムによれば、評価対象の材料の種類をユーザに選択できるようにすることによって、評価対象の材料（組成式）の合成可能性の推定精度を向上させることができる。 As described above, according to the material information output system according to the second embodiment, by allowing the user to select the type of the material to be evaluated, the possibility of synthesizing the material (composition formula) to be evaluated is determined. Estimation accuracy can be improved.

（実施の形態３） (Embodiment 3)
実施の形態３に係る材料情報出力システムは、疑似組成式の形成エネルギーを推定し、推定結果に応じて疑似組成式に付与された教師ラベルＬを補正するものである。 The material information output system according to Embodiment 3 estimates the formation energy of each pseudo composition formula and corrects the teacher label L assigned to the pseudo composition formula according to the estimation result.

形成エネルギーとは、物質が安定に存在しうるかに関わる特性値である。形成エネルギーが低すぎる場合は、物質の安定性が低下する。一方、形成エネルギーが高すぎる場合も物質の安定性は低下する。そのため、適度な形成エネルギーを持つことで、物質は安定に存在することができる。物質が安定に存在しうるということは、合成の容易さを意味する。そのため、本実施の形態においては、疑似組成式の形成エネルギーを推定し、推定結果を用いて教師ラベルＬを補正する。 Formation energy is a characteristic value related to whether a substance can exist stably. If the energy of formation is too low, the material becomes less stable. On the other hand, if the energy of formation is too high, the stability of the material will also decrease. Therefore, substances can exist stably by having appropriate formation energy. The ability of a substance to exist stably means ease of synthesis. Therefore, in the present embodiment, the formation energy of the pseudo-composition formula is estimated, and the teacher label L is corrected using the estimation result.

図１９は、本開示の実施の形態３に係る材料情報出力システムの構成の一例を示すブロック図である。なお、本実施の形態において、実施の形態１、２と同一の構成要素には同一の符号を付し、説明を省略する。 FIG. 19 is a block diagram showing an example configuration of a material information output system according to Embodiment 3 of the present disclosure. In addition, in this embodiment, the same reference numerals are assigned to the same constituent elements as in the first and second embodiments, and the description thereof is omitted.

図１９において、図１との相違点は、エネルギー推定モデル学習部１６０１と、エネルギー推定モデル保持部１６０２とが新たに追加されている点にある。本実施の形態においては、既知組成式リスト１０１に含まれる既知組成式の特性値には、形成エネルギーが含まれているものとする。すなわち、本実施の形態では、既知組成式の形成エネルギーがメモリに事前に記憶されている。 FIG. 19 differs from FIG. 1 in that an energy estimation model learning unit 1601 and an energy estimation model holding unit 1602 are newly added. In the present embodiment, the characteristic values of the known composition formulas included in the known composition formula list 101 are assumed to include the formation energy. That is, in the present embodiment, the formation energies of the known composition formulas are stored in the memory in advance.

エネルギー推定モデル学習部１６０１は、既知組成式リスト１０１から各既知組成式について、特徴量及び記述子と形成エネルギーとのペアの情報を読み込み、形成エネルギーを推定するためのニューラルネットワークの荷重Ｗを調整することでエネルギー推定モデルを生成する。 For each known composition formula, the energy estimation model learning unit 1601 reads from the known composition formula list 101 the pair consisting of the feature values and descriptors and the formation energy, and generates an energy estimation model by adjusting the weights W of a neural network for estimating the formation energy.

このとき、形成エネルギーは実数値となるため、このニューラルネットワークは回帰問題を解くこととなる。すなわち、エネルギー推定モデルは、特徴量及び記述子を入力データＸとし、形成エネルギーを出力データＹとするニューラルネットワークである。 At this time, since the formation energy becomes a real value, this neural network solves the regression problem. That is, the energy estimation model is a neural network in which input data X is the feature quantity and the descriptor, and output data Y is the formation energy.

そして、エネルギー推定モデル学習部１６０１は、学習の結果得られたエネルギー推定モデルに、既知組成式リスト１０１に含まれる各既知組成式の特徴量及び記述子を入力することで、各既知組成式に対する形成エネルギーを得る。そして、エネルギー推定モデル学習部１６０１は、得られた各既知組成式の形成エネルギーの平均と分散とを算出する。ここでは、既知エネルギーの平均と分散との算出にはエネルギー推定モデルが推定した形成エネルギーが用いられたが、既知組成式リスト１０１に記憶されている形成エネルギーが用いられてもよい。 Then, the energy estimation model learning unit 1601 inputs the feature values and descriptors of each known composition formula included in the known composition formula list 101 into the trained energy estimation model to obtain the formation energy of each known composition formula. The energy estimation model learning unit 1601 then calculates the average and variance of the formation energies obtained. Here, the formation energies estimated by the energy estimation model are used to calculate the average and variance, but the formation energies stored in the known composition formula list 101 may be used instead.
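
The statistics step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `formation_energy_stats` and the sample energies are hypothetical, and the use of the population variance (rather than the sample variance) is an assumption, since the text does not say which is used.

```python
from statistics import fmean, pvariance


def formation_energy_stats(energies):
    """Return (average, variance) of the formation energies of the known formulas.

    `energies` may hold either values predicted by the energy estimation model
    or values stored in the known composition formula list; the text allows both.
    """
    if len(energies) < 2:
        raise ValueError("need at least two formation energies")
    average = fmean(energies)
    # Assumption: population variance; the source does not specify the estimator.
    variance = pvariance(energies, mu=average)
    return average, variance


# Hypothetical formation energies, for illustration only.
known_energies = [-1.0, -2.0, -3.0]
average, variance = formation_energy_stats(known_energies)
```

The two returned values correspond to what is later held alongside the model in the energy estimation model holding unit 1602.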

荷重Ｗの調整が完了したニューラルネットワークは、エネルギー推定モデルとしてエネルギー推定モデル保持部１６０２に保持される。エネルギー推定モデル保持部１６０２に保持される情報には、推定モデル保持部１０７と同様、エネルギー推定モデルを構成するニューラルネットワークが有する層の数及び各層に配置されるユニット数を表す構成情報と、各ユニットでの荷重和計算に用いられる荷重値を表す荷重Ｗとが含まれる。加えて、エネルギー推定モデル学習部１６０１によって算出された既知組成式の形成エネルギーの平均及び分散もエネルギー推定モデル保持部１６０２に保持される。 The neural network whose weights W have been adjusted is held in the energy estimation model holding unit 1602 as the energy estimation model. As with the estimation model holding unit 107, the information held in the energy estimation model holding unit 1602 includes configuration information indicating the number of layers of the neural network constituting the energy estimation model and the number of units in each layer, and the weights W representing the weight values used in the weighted-sum calculation in each unit. In addition, the average and variance of the formation energies of the known composition formulas calculated by the energy estimation model learning unit 1601 are also held in the energy estimation model holding unit 1602.

教師ラベル付与部１０５は、実施の形態１の手法を用いて疑似組成式に教師ラベルＬを付与すると共に、エネルギー推定モデルにより推定された疑似組成式の形成エネルギーを用いて教師ラベルＬを補正する。 The teacher label assigning unit 105 assigns the teacher label L to the pseudo compositional formula using the method of the first embodiment, and corrects the teacher label L using the formation energy of the pseudo compositional formula estimated by the energy estimation model. .

具体的には、まず、教師ラベル付与部１０５は、実施の形態１の方法で疑似組成式保持部１０４に保持された各疑似組成式に教師ラベルＬを付与する。次に、教師ラベル付与部１０５は、エネルギー推定モデル保持部１６０２からエネルギー推定モデルを取得する。次に、教師ラベル付与部１０５は、疑似組成式保持部１０４に保持された各疑似組成式の特徴量及び記述子をエネルギー推定モデルに入力することで、各疑似組成式の形成エネルギーを得る。次に、教師ラベル付与部１０５は、エネルギー推定モデル保持部１６０２に保持されている既知組成式の形成エネルギーの平均及び分散を読み込む。次に、教師ラベル付与部１０５は、読み込んだ平均及び分散と、各疑似組成式の形成エネルギーとを用いて、各疑似組成式のマハラノビス距離Ｍを求める。 Specifically, the teacher label assigning unit 105 first assigns the teacher label L to each pseudo composition formula held in the pseudo composition formula holding unit 104 by the method of Embodiment 1. Next, it acquires the energy estimation model from the energy estimation model holding unit 1602. Next, it obtains the formation energy of each pseudo composition formula by inputting the feature values and descriptors of each pseudo composition formula held in the pseudo composition formula holding unit 104 into the energy estimation model. Next, it reads the average and variance of the formation energies of the known composition formulas held in the energy estimation model holding unit 1602. Finally, it obtains the Mahalanobis distance M of each pseudo composition formula using the read average and variance and the formation energy of each pseudo composition formula.

次に、教師ラベル付与部１０５は、疑似組成式に付与されている教師ラベルＬに対してＬ’＝ｅｘｐ（－Ｍ）×Ｌの演算を行い、教師ラベルＬを補正する。Ｌ’は補正後の教師ラベルを表す。これは、学習回数に基づいて得られた教師ラベルＬに形成エネルギーに応じた重み付けを行ったことを意味する。これにより、教師ラベルＬは、マハラノビス距離Ｍが増大するにつれて値が小さくなるように補正されるのである。 Next, the teacher label assigning unit 105 corrects the teacher label L by calculating L′=exp(−M)×L on the teacher label L assigned to the pseudo composition formula. L' represents the teacher label after correction. This means that the teacher label L obtained based on the number of times of learning is weighted according to the formation energy. As a result, the teacher label L is corrected so that its value decreases as the Mahalanobis distance M increases.
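
The correction L' = exp(-M) × L can be sketched as below. This is an illustrative sketch under stated assumptions: the formation energy is treated as a scalar, so the Mahalanobis distance reduces to |E - average| / sqrt(variance), and the function names `mahalanobis_1d` and `correct_label` are hypothetical.

```python
import math


def mahalanobis_1d(energy, average, variance):
    """Mahalanobis distance of a scalar formation energy from the known-formula
    distribution (assumption: one-dimensional case)."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    return abs(energy - average) / math.sqrt(variance)


def correct_label(label, energy, average, variance):
    """Apply L' = exp(-M) * L, so the label shrinks as the distance M grows."""
    m = mahalanobis_1d(energy, average, variance)
    return math.exp(-m) * label


# Hypothetical values: a pseudo formula whose estimated energy equals the
# average keeps its label (M = 0); one far from the average is down-weighted.
unchanged = correct_label(0.8, energy=-2.0, average=-2.0, variance=0.25)
reduced = correct_label(0.8, energy=-1.0, average=-2.0, variance=0.25)
```

Because exp(-M) lies in (0, 1], the corrected label never exceeds the original label and stays within the range of a synthesis possibility.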

上述したように、形成エネルギーの値が高すぎる又は低すぎると、物質は安定しない。そのため、既知組成式の形成エネルギーに対してかけ離れた形成エネルギーを持つ疑似組成式は合成可能性が低いと言える。そこで、本実施の形態では、既知組成式の形成エネルギーの集団に対して疑似組成式の形成エネルギーがどの程度離れているかを示すマハラノビス距離Ｍを用いて教師ラベルＬを補正する。これにより、本実施の形態では、疑似組成式に対して物理学の知見を踏まえた教師ラベルＬ’を付与することができる。その結果、本実施の形態は、合成可能性が高精度に推定できる推定モデルを生成することができる。 As mentioned above, a substance is not stable if its formation energy is too high or too low. A pseudo composition formula whose formation energy is far from the formation energies of the known composition formulas can therefore be regarded as having a low synthesis possibility. Accordingly, in the present embodiment, the teacher label L is corrected using the Mahalanobis distance M, which indicates how far the formation energy of the pseudo composition formula lies from the population of formation energies of the known composition formulas. In this way, a teacher label L' that reflects knowledge from physics can be assigned to each pseudo composition formula. As a result, the present embodiment can generate an estimation model that estimates the synthesis possibility with high accuracy.

なお、本実施の形態では距離としてマハラノビス距離Ｍが採用されたがこれは一例であり、既知組成式の形成エネルギーの平均に対するユークリッド距離等、既知組成式の形成エネルギーの集団に対する距離が測定可能な距離であればどのような距離が採用されてもよい。 In the present embodiment, the Mahalanobis distance M is used as the distance, but this is an example, and the distance to the group of formation energies of the known composition formula, such as the Euclidean distance to the average of the formation energies of the known composition formula, can be measured. Any distance may be adopted as long as it is the distance.

（実施の形態４） (Embodiment 4)
実施の形態４に係る材料情報出力システムは、特性値の範囲をユーザが入力すると、その範囲内の特性値を持つ疑似組成式を合成可能性と合わせてユーザに提示するものである。 When the user inputs a range of characteristic values, the material information output system according to Embodiment 4 presents to the user pseudo composition formulas having characteristic values within that range, together with their synthesis possibilities.

図２０は、本開示の実施の形態４に係る材料情報出力システムの構成の一例を示すブロック図である。なお、本実施の形態において実施の形態１～３と同一の構成要素には同一の符号を付し説明を省略する。 FIG. 20 is a block diagram showing an example of a configuration of a material information output system according to Embodiment 4 of the present disclosure. In this embodiment, the same components as in Embodiments 1 to 3 are denoted by the same reference numerals, and descriptions thereof are omitted.

図２０において、図１との相違点は、更に、特性値推定モデル保持部１７０１と、生成モデル保持部１７０２と、特性値算出部１７０３とが材料情報出力装置１００に設けられている点にある。 FIG. 20 differs from FIG. 1 in that a characteristic value estimation model holding unit 1701, a generative model holding unit 1702, and a characteristic value calculation unit 1703 are additionally provided in the material information output device 100.

実施の形態１で示したように、生成モデル学習部１０２の学習では、生成部８０２と識別部８０３とによる敵対的学習アルゴリズムが用いられている。学習の結果得られた生成部８０２のニューラルネットワークＮＮ１は、特性値と乱数の発生数とが設定されると、設定された特性値を持つことが推定される疑似組成式を、設定した乱数の発生数だけ生成できる。また、識別部８０３のニューラルネットワークＮＮ２は、疑似組成式の特徴量及び記述子が入力されると、その疑似組成式の特性値の推定結果を出力できる。 As shown in Embodiment 1, the generative model learning unit 102 is trained with an adversarial learning algorithm involving the generating unit 802 and the identifying unit 803. Once a characteristic value and the number of random numbers to generate are set, the trained neural network NN1 of the generating unit 802 can generate as many pseudo composition formulas, each estimated to have the set characteristic value, as the set number of random numbers. Likewise, when the feature values and descriptors of a pseudo composition formula are input, the neural network NN2 of the identifying unit 803 can output an estimate of the characteristic value of that pseudo composition formula.

そこで、本実施の形態では、学習済みのニューラルネットワークＮＮ１と学習済みのニューラルネットワークＮＮ２とを活用過程で利用する。 Therefore, in the present embodiment, the trained neural network NN1 and the trained neural network NN2 are used in the utilization process.

すなわち、特性値推定モデル保持部１７０１は、学習済みのニューラルネットワークＮＮ２を特性値推定モデルとして保持する。生成モデル保持部１７０２は、学習済みのニューラルネットワークＮＮ１を生成モデルとして保持する。 That is, the characteristic value estimation model holding unit 1701 holds the trained neural network NN2 as a characteristic value estimation model. The generative model holding unit 1702 holds the trained neural network NN1 as a generative model.

活用過程において、入力部１０３は、組成式ではなく、特性値の範囲がユーザにより入力される。特性値は、例えば、電池の充放電率又は容量等である。或いは、特性値は、ヤング率又はイオン化ポテンシャル等であってもよい。ユーザである材料開発者は、所望の特性値の範囲を満足する組成式を探索したい場合がある。そこで、本実施の形態では、ユーザに入力部１０３を用いて特性値の範囲を入力させる。加えて、ユーザに入力部１０３を用いて生成したい疑似組成式の生成数Ｎを入力させる。ここで生成数Ｎは、表示部１１０での視認が容易な数であり、例えば１０や１００である。 In the utilization process, the user inputs a range of characteristic values, rather than a composition formula, to the input unit 103. The characteristic value is, for example, the charge/discharge rate or capacity of a battery. Alternatively, it may be Young's modulus, ionization potential, or the like. A material developer using the system may want to search for composition formulas that satisfy a desired range of characteristic values. Therefore, in the present embodiment, the user enters the range of characteristic values using the input unit 103. In addition, the user enters the number N of pseudo composition formulas to be generated. Here, N is a number that is easy to view on the display unit 110, for example 10 or 100.

記述子算出部１０８は、生成モデル保持部１７０２から生成モデルを取得し、その生成モデルに対して特性値を入力し、特徴量及び記述子を算出する。 The descriptor calculation unit 108 acquires a generative model from the generative model holding unit 1702, inputs characteristic values to the generative model, and calculates feature amounts and descriptors.

具体的には、記述子算出部１０８は、ユーザが入力した特性値の範囲内の特性値と乱数とのペアからなる入力データＸ＝［乱数、特性値］をＮ個生成する。ここで、Ｎ個の入力データＸは、Ｘ１＝［乱数Ａ１、特性値Ｂ１］、Ｘ２＝［乱数Ａ２、特性値Ｂ２］、・・・、ＸＮ＝［乱数ＡＮ、特性値ＢＮ］と表される。 Specifically, the descriptor calculation unit 108 generates N pieces of input data X = [random number, characteristic value], each consisting of a random number paired with a characteristic value that lies within the range entered by the user. The N pieces of input data X are expressed as X1 = [random number A1, characteristic value B1], X2 = [random number A2, characteristic value B2], ..., XN = [random number AN, characteristic value BN].

そして、記述子算出部１０８は、生成モデルにＮ個の入力データＸを順次入力し、特徴量及び記述子を表すＮ個の出力データＹ、すなわち、Ｎ個の疑似組成式を得る。ここで、記述子算出部１０８は、ユーザが入力部１０３に入力した特性値の範囲内からランダムにＮ個の値をサンプリングすることでＮ個の特性値を生成すればよい。 Then, the descriptor calculation unit 108 sequentially inputs N pieces of input data X to the generative model, and obtains N pieces of output data Y representing feature amounts and descriptors, that is, N pieces of pseudo compositional formulas. Here, the descriptor calculation unit 108 may generate N characteristic values by randomly sampling N values from within the range of characteristic values input by the user to the input unit 103 .
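
The construction of the N inputs X = [random number, characteristic value] can be sketched as follows. The details are assumptions made for illustration: the source does not state the dimension or distribution of the random number, so a Gaussian noise vector of hypothetical length `noise_dim` is used, and characteristic values are drawn uniformly from the user's range. The function name `build_generator_inputs` is likewise hypothetical.

```python
import random


def build_generator_inputs(value_min, value_max, n, noise_dim=8, seed=None):
    """Build N pairs (noise vector, characteristic value) for the generative model.

    Assumptions: Gaussian noise of length `noise_dim`, and uniform sampling of
    the characteristic value within [value_min, value_max].
    """
    if value_min > value_max:
        raise ValueError("value_min must not exceed value_max")
    rng = random.Random(seed)
    inputs = []
    for _ in range(n):
        noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]  # random number A_i
        value = rng.uniform(value_min, value_max)                # characteristic value B_i
        inputs.append((noise, value))
    return inputs


# Example matching the figures quoted for the scatter diagram: range 50 to 500, N = 6.
inputs = build_generator_inputs(50.0, 500.0, n=6, seed=0)
```

Each pair would then be fed to the trained generative model (NN1) to obtain one pseudo composition formula.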

特性値算出部１７０３は、特性値推定モデル保持部１７０１に保持されている特性値推定モデルを取得する。そして、特性値算出部１７０３は、記述子算出部１０８で得られたＮ個の特徴量及び記述子を入力データＸとして特性値推定モデルに入力し、Ｎ個の疑似組成式に対する特性値の推定結果を表すＮ個の出力データＹを算出する。 A characteristic value calculation unit 1703 acquires the characteristic value estimation model held in the characteristic value estimation model holding unit 1701 . Then, the characteristic value calculation unit 1703 inputs the N feature values and descriptors obtained by the descriptor calculation unit 108 to the characteristic value estimation model as input data X, and estimates characteristic values for the N pseudo composition formulas. Calculate N output data Y representing the result.

合成可能性算出部１０９は、推定モデル保持部１０７から推定モデルを取得し、記述子算出部１０８が算出したＮ個の特徴量及び記述子を推定モデルに入力し、Ｎ個の疑似組成式に対する合成可能性を算出する。 Synthesis possibility calculation unit 109 acquires an estimated model from estimated model holding unit 107, inputs N feature amounts and descriptors calculated by descriptor calculation unit 108 to the estimated model, and calculates N pseudo composition formulas. Calculate the synthesizability.

表示部１１０は、合成可能性を示す第１軸と、特性値を示す第２軸とを備えるグラフ上に、Ｎ個の疑似組成式についての合成可能性と特性値とがプロットされた散布図を生成して表示する。 The display unit 110 generates and displays a scatter diagram in which the synthesis possibilities and characteristic values of the N pseudo composition formulas are plotted on a graph having a first axis indicating the synthesis possibility and a second axis indicating the characteristic value.

図２１は、表示部１１０が表示する散布図Ｇ２の一例を示す図である。散布図Ｇ２は、縦軸を特性値、横軸を合成可能性とする２軸のグラフ１８０１を備える。グラフ１８０１の下には、ユーザが入力部１０３を用いて入力した特性値の範囲と、疑似組成式の生成数とが表示されている。ここでは、特性値として５０～５００、生成数として６が入力されている。 FIG. 21 shows an example of the scatter diagram G2 displayed by the display unit 110. The scatter diagram G2 has a two-axis graph 1801 with the characteristic value on the vertical axis and the synthesis possibility on the horizontal axis. Below the graph 1801, the range of characteristic values entered by the user via the input unit 103 and the number of pseudo composition formulas to generate are displayed. Here, 50 to 500 is entered as the characteristic value range and 6 as the number to generate.

グラフ１８０１には、生成された６個の疑似組成式を示す６個の点ＰＰ１がプロットされている。また、各点ＰＰ１の近傍には、特徴量及び記述子から算出される組成式が表示されている。実施の形態１でも示したように、特徴量及び記述子から組成式への変換は容易である。この表示によって、材料開発者は、狙いとする特性値の範囲内でどのような組成式が生成可能であるかを容易に認識できる。 Graph 1801 plots six points PP1 representing the six pseudo-composition formulas generated. Also, near each point PP1, a composition formula calculated from the feature amount and the descriptor is displayed. As shown in the first embodiment, it is easy to convert the feature values and descriptors into the composition formula. This display allows material developers to easily recognize what kind of compositional formula can be generated within the range of targeted characteristic values.

図２２は、本実施の形態４に係る材料情報出力システムの活用過程における処理の一例を示すフローチャートである。 FIG. 22 is a flow chart showing an example of processing in the utilization process of the material information output system according to the fourth embodiment.

Ｓ４０１では、入力部１０３は、ユーザにより、特性値の範囲と生成したい疑似組成式の生成数Ｎとが入力される。Ｓ４０２では、記述子算出部１０８は、入力された特性値の範囲から特性値と乱数とのペアからなるＮ個の入力データＸを生成する。Ｓ４０３では、記述子算出部１０８は、生成モデル保持部１７０２から生成モデルを取得し、生成モデルにＮ個の入力データＸ＝［乱数、特性値］を入力し、Ｎ個の疑似組成式を生成する。 In S401, the user enters the range of characteristic values and the number N of pseudo composition formulas to be generated into the input unit 103. In S402, the descriptor calculation unit 108 generates, from the entered range, N pieces of input data X each consisting of a characteristic value paired with a random number. In S403, the descriptor calculation unit 108 acquires the generative model from the generative model holding unit 1702, inputs the N pieces of input data X = [random number, characteristic value] into the generative model, and generates N pseudo composition formulas.

Ｓ４０４では、特性値算出部１７０３は、特性値推定モデル保持部１７０１から特性値推定モデルを取得し、Ｎ個の疑似組成式を特性値推定モデルに入力し、Ｎ個の特性値を算出する。 In S404, the characteristic value calculation unit 1703 acquires the characteristic value estimation model from the characteristic value estimation model holding unit 1701, inputs N pseudo composition formulas to the characteristic value estimation model, and calculates N characteristic values.

Ｓ４０４と同時に、Ｓ４０５では、合成可能性算出部１０９は、推定モデル保持部１０７から推定モデルを取得し、推定モデルにＮ個の疑似組成式を入力し、Ｎ個の疑似組成式のそれぞれに対する合成可能性を算出する。 Simultaneously with S404, in S405, the synthesizing possibility calculation unit 109 acquires an estimated model from the estimated model holding unit 107, inputs N pseudo composition formulas to the estimated model, and performs synthesis for each of the N pseudo composition formulas. Calculate probabilities.

Ｓ４０６では、表示部１１０はＳ４０３で算出された疑似組成式の特徴量及び記述子から対応する組成式を算出し、Ｎ個の疑似組成式をグラフ１８０１にプロットすると共に、プロットした各点ＰＰ１の近傍に対応する組成式がグラフ１８０１上に表示された散布図Ｇ２を生成して表示する。 In S406, the display unit 110 calculates the corresponding composition formula from the feature values and descriptors of each pseudo composition formula obtained in S403, plots the N pseudo composition formulas on the graph 1801, and generates and displays the scatter diagram G2 in which the corresponding composition formula is shown near each plotted point PP1.
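
Steps S402 to S406 can be summarised in the following sketch. It only illustrates the data flow: the four callables stand in for the trained generative model, the characteristic value estimation model, the synthesis possibility estimation model, and the descriptor-to-formula conversion, none of whose interfaces are given in the source. All names (`run_utilization`, `generator`, `property_model`, `synth_model`, `to_formula`) are hypothetical, and a single scalar random number is assumed for brevity.

```python
import random


def run_utilization(value_range, n, generator, property_model, synth_model,
                    to_formula, seed=None):
    """Return N scatter-plot points (formula, synthesis possibility, characteristic value)."""
    rng = random.Random(seed)
    low, high = value_range
    # S402: pair a random number with a characteristic value sampled from the range.
    inputs = [(rng.random(), rng.uniform(low, high)) for _ in range(n)]
    # S403: the generative model turns each input into a pseudo composition formula
    # (represented here by its feature values and descriptors).
    candidates = [generator(noise, value) for noise, value in inputs]
    # S404: estimate the characteristic value of each candidate.
    values = [property_model(c) for c in candidates]
    # S405: estimate the synthesis possibility of each candidate.
    scores = [synth_model(c) for c in candidates]
    # S406: convert descriptors to a formula label and assemble the plot points.
    return [(to_formula(c), s, v) for c, s, v in zip(candidates, scores, values)]


# Toy stand-ins so the sketch runs; real models would be trained networks.
points = run_utilization(
    (50.0, 500.0), 6,
    generator=lambda noise, value: [noise, value],
    property_model=lambda c: c[1],
    synth_model=lambda c: c[0],
    to_formula=lambda c: f"candidate({c[1]:.0f})",
    seed=0,
)
```

Each returned tuple corresponds to one point PP1: the second element is the horizontal coordinate (synthesis possibility), the third is the vertical coordinate (characteristic value), and the first is the label drawn next to the point.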

以上説明したように、実施の形態４に係る材料情報出力システムによれば、ユーザが狙いとする特性値の範囲を入力すると、その範囲内の特性値を持つ疑似組成式が生成され、その疑似組成式がグラフ１８０１上にプロットされた散布図Ｇ２が生成される。そのため、本実施の形態は、材料開発者の解析を補助するための有用な判断材料を提示できる。 As described above, according to the material information output system according to Embodiment 4, when the user inputs the target characteristic value range, a pseudo composition formula having characteristic values within the range is generated, and the pseudo composition formula is generated. A scatter diagram G2 is generated in which the compositional formula is plotted on the graph 1801 . Therefore, the present embodiment can present useful judgment materials for assisting the material developer's analysis.

なお、実施の形態４では、グラフ１８０１は２軸で構成されているがこれは一例であり、特性値及び合成可能性以外に表示したいパラメータがあれば、３軸以上のグラフで構成されてもよい。 In the fourth embodiment, the graph 1801 is composed of two axes, but this is just an example. If there are parameters to be displayed other than characteristic values and synthesizing possibilities, the graph may be composed of three or more axes. good.

（ハードウェア構成） (Hardware configuration)
材料情報出力システムは、コンピュータを利用して実現される。図２３は、材料情報出力システムを実現するためのハードウェア構成の一例を示すブロック図である。材料情報出力システムは、コンピュータ２０００と、コンピュータ２０００に指示を与えるためのキーボード２０１１及びマウス２０１２と、コンピュータ２０００の演算結果等の情報を提示するためのディスプレイ２０１０と、コンピュータ２０００で実行されるプログラムを読み取るためのＯＤＤ（ＯｐｔｉｃａｌＤｉｓｋＤｒｉｖｅ）２００８とを含む。 The material information output system is realized using a computer. FIG. 23 is a block diagram showing an example of a hardware configuration for realizing the material information output system. The material information output system includes a computer 2000, a keyboard 2011 and a mouse 2012 for giving instructions to the computer 2000, a display 2010 for presenting information such as calculation results of the computer 2000, and an ODD (Optical Disk Drive) 2008 for reading a program to be executed by the computer 2000.

材料情報出力システムが実行するプログラムは、コンピュータで読み取り可能な光記憶媒体２００９に記憶され、ＯＤＤ２００８で読み取られる。または、コンピュータネットワークを通じてＮＩＣ２００６で読み取られる。 The program executed by the material information output system is stored in a computer-readable optical storage medium 2009 and read by the ODD 2008. Alternatively, the program is read by the NIC 2006 over a computer network.

コンピュータ２０００は、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）２００１と、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）２００４と、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）２００３と、ＨＤＤ（ＨａｒｄＤｉｓｋＤｒｉｖｅ）２００５と、ＮＩＣ（ＮｅｔｗｏｒｋＩｎｔｅｒｆａｃｅＣｏｎｔｒｏｌｌｅｒ）２００６と、バス２００７とを含む。 The computer 2000 includes a CPU (Central Processing Unit) 2001, a ROM (Read Only Memory) 2004, a RAM (Random Access Memory) 2003, an HDD (Hard Disk Drive) 2005, a NIC (Network Interface Controller) 2006, and a bus 2007.

更に、コンピュータ２０００は、高速演算を行うためのＧＰＵ（ＧｒａｐｈｉｃａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）２００２を含んでも良い。 Furthermore, the computer 2000 may include a GPU (Graphical Processing Unit) 2002 for performing high-speed calculations.

ＣＰＵ２００１とＧＰＵ２００２とは、ＯＤＤ２００８またはＮＩＣ２００６を介して読み取られたプログラムを実行する。ＲＯＭ２００４は、コンピュータ２０００の動作に必要なプログラム及びデータを記憶する。ＲＡＭ２００３は、プログラム実行時のパラメータなどのデータを記憶する。ＨＤＤ２００５は、プログラム及びデータなどを記憶する。ＮＩＣ２００６は、コンピュータネットワークを介して他のコンピュータとの通信を行う。バス２００７は、ＣＰＵ２００１、ＲＯＭ２００４、ＲＡＭ２００３、ＨＤＤ２００５、ＮＩＣ２００６、ディスプレイ２０１０、キーボード２０１１、マウス２０１２及びＯＤＤ２００８を相互に接続する。 CPU2001 and GPU2002 run the program read via ODD2008 or NIC2006. ROM 2004 stores programs and data necessary for the operation of computer 2000 . A RAM 2003 stores data such as parameters for program execution. The HDD 2005 stores programs, data, and the like. NIC 2006 communicates with other computers via a computer network. A bus 2007 connects the CPU 2001, ROM 2004, RAM 2003, HDD 2005, NIC 2006, display 2010, keyboard 2011, mouse 2012 and ODD 2008 to each other.

尚、コンピュータ２０００に接続されているキーボード２０１１、マウス２０１２、及びＯＤＤ２００８は、例えばディスプレイ２０１０がタッチパネルになっている場合又はＮＩＣ２００６を利用する場合には、取り外しても良い。 Note that the keyboard 2011, the mouse 2012, and the ODD 2008 connected to the computer 2000 may be removed, for example, when the display 2010 is a touch panel or when the NIC 2006 is used.

図２３に示すコンピュータ２０００は、図１に示す材料情報出力装置１００と上述した第１コンピュータと、第２コンピュータとのそれぞれを構成してもよい。 The computer 2000 shown in FIG. 23 may constitute each of the material information output device 100 shown in FIG. 1, the above-described first computer, and the second computer.

更に、上記の各装置を構成する材料情報出力システムの構成要素の一部または全ては、１個のシステムＬＳＩ（ＬａｒｇｅＳｃａｌｅＩｎｔｅｇｒａｔｉｏｎ：大規模集積回路）から構成されてもよい。システムＬＳＩは、複数の構成部を１個のチップ上に蓄積して製造された超多機能ＬＳＩであり、具体的には、マイクロプロセッサ、ＲＯＭ、ＲＡＭなどを含んで構成されるコンピュータシステムである。ＲＡＭには、コンピュータプログラムが記憶されている。マイクロプロセッサが、コンピュータプログラムに従って動作することにより、システムＬＳＩは、その機能を達成する。 Furthermore, some or all of the constituent elements of the material information output system that make up each of the above devices may be implemented as a single system LSI (Large Scale Integration). A system LSI is a super-multifunctional LSI manufactured by integrating a plurality of components on a single chip; specifically, it is a computer system that includes a microprocessor, ROM, RAM, and the like. A computer program is stored in the RAM. The system LSI achieves its functions by the microprocessor operating according to the computer program.

また、上記の各装置を構成する構成要素の一部または全ては、各装置に着脱可能なＩＣカードまたは単体モジュールから構成されてもよい。ＩＣカードまたはモジュールは、マイクロプロセッサ、ＲＯＭ、ＲＡＭなどから構成されるコンピュータシステムである。ＩＣカードまたはモジュールは、上記の超多機能ＬＳＩを含むとしても良い。マイクロプロセッサが、コンピュータプログラムに従って動作することにより、ＩＣカードまたはモジュールは、その機能を達成する。このＩＣカードまたはこのモジュールは、耐タンパ性を有するとしても良い。 Moreover, some or all of the components constituting each device described above may be configured from an IC card or a single module that can be attached to and detached from each device. An IC card or module is a computer system that consists of a microprocessor, ROM, RAM, and so on. The IC card or module may include the super-multifunctional LSI described above. The IC card or module achieves its function by the microprocessor operating according to the computer program. This IC card or this module may have tamper resistance.

また、本開示は、上記に示す材料情報出力システムの処理を実行する方法であるとしても良い。また、これらの方法をコンピュータにより実現するコンピュータプログラムを含んでも良いし、前記コンピュータプログラムからなるデジタル信号を含んでも良い。 The present disclosure may also be a method for executing the processing of the material information output system described above. Moreover, a computer program for implementing these methods by a computer may be included, or a digital signal composed of the computer program may be included.

更に、本開示は、上記コンピュータプログラム又は上記デジタル信号をコンピュータ読み取り可能な非一時的な記憶媒体、例えば、フレキシブルディスク、ハードディスク、ＣＤ－ＲＯＭ、ＭＯ、ＤＶＤ、ＤＶＤ－ＲＯＭ、ＤＶＤ－ＲＡＭ、ＢＤ（Ｂｌｕ－ｒａｙ（登録商標）Ｄｉｓｃ）、半導体メモリなどに記憶したものを含んでも良い。また、これら非一時的な記憶媒体に記録されている上記デジタル信号を含んでも良い。 Furthermore, the present disclosure provides a computer-readable non-transitory storage medium for the computer program or the digital signal, such as a flexible disk, hard disk, CD-ROM, MO, DVD, DVD-ROM, DVD-RAM, BD ( Blu-ray (registered trademark) Disc), semiconductor memory, or the like may be included. Moreover, the digital signal recorded in these non-temporary storage media may also be included.

また、本発明は、上記コンピュータプログラムまたは上記デジタル信号を、電気通信回線、無線または有線通信回路、インターネットを代表とするネットワーク、データ放送等を経由して伝送するものとしても良い。 Further, according to the present invention, the computer program or the digital signal may be transmitted via an electric communication line, a wireless or wired communication circuit, a network represented by the Internet, data broadcasting, or the like.

また、上記プログラムまたは上記デジタル信号を上記非一時的な記憶媒体に記録して移送することにより、または上記プログラムまたは上記デジタル信号は上記ネットワーク等を経由して移送することにより、独立した他のコンピュータシステムにより実施するとしても良い。 In addition, by recording the program or the digital signal in the non-temporary storage medium and transferring it, or by transferring the program or the digital signal via the network or the like, another independent computer It may be implemented by the system.

図２４は、本開示の材料情報出力システムのハードウェア構成の他の一例を示す図である。図２４の例では、コンピュータ２０００と、コンピュータ２０００とネットワークを介して接続されたデータサーバ２１０１とを備えている。データサーバ２１０１は、データを記憶するメモリを備える。このデータは、上記ネットワーク等を経由してコンピュータ２０００に読み出される。また、データサーバ２１０１から情報を読み出す、コンピュータ２０００は１台である必要はなく、複数であっても良い。その際、各コンピュータ２０００が材料情報出力システムの構成要素の一部をそれぞれ実施しても良い。 FIG. 24 is a diagram showing another example of the hardware configuration of the material information output system of the present disclosure. The example of FIG. 24 includes a computer 2000 and a data server 2101 connected to the computer 2000 via a network. The data server 2101 has a memory for storing data. This data is read out to the computer 2000 via the network or the like. The number of computers 2000 that read information from the data server 2101 need not be one and may be more than one. In that case, each computer 2000 may implement a part of the components of the material information output system.

図２４のハードウェア構成を採用する場合、推定モデル保持部１０７、疑似組成式保持部１０４、既知組成式リスト１０１をデータサーバ２１０１に設ければよい。また、図２４のハードウェア構成を採用する場合、図１等に示す各ブロックは、データサーバ２１０１及びコンピュータ２０００に分散して実装されてもよい。 When the hardware configuration of FIG. 24 is adopted, the estimation model holding unit 107, the pseudo composition formula holding unit 104, and the known composition formula list 101 may be provided in the data server 2101. In that configuration, the blocks shown in FIG. 1 and other figures may also be distributed across the data server 2101 and the computer 2000.

更に、上記実施の形態および上記変形例をそれぞれ組み合わせるとしても良い。 Furthermore, the above embodiments and modifications may be combined.

今回開示された実施の形態は全ての点で例示であって制限的なものではないと考えられるべきである。本開示の範囲は上記した説明ではなく、請求の範囲によって示され、請求の範囲と均等の意味および範囲内での全ての変更が含まれることが意図される。 It should be considered that the embodiments disclosed this time are illustrative in all respects and not restrictive. The scope of the present disclosure is indicated by the scope of claims rather than the above description, and is intended to include all changes within the scope and meaning equivalent to the scope of the claims.

本発明によると、未知の組成式に対する適切な合成可能性を提示できるので、新規材料開発の分野にとって有用である。 INDUSTRIAL APPLICABILITY According to the present invention, suitable synthesizing possibilities for unknown composition formulas can be presented, which is useful in the field of new material development.

１００：材料情報出力装置 100: Material information output device
１０１：既知組成式リスト 101: Known composition formula list
１０２：生成モデル学習部 102: Generative model learning unit
１０３：入力部 103: Input unit
１０４：疑似組成式保持部 104: Pseudo composition formula holding unit
１０５：教師ラベル付与部 105: Teacher label assigning unit
１０６：推定モデル学習部 106: Estimation model learning unit
１０７：推定モデル保持部 107: Estimation model holding unit
１０８：記述子算出部 108: Descriptor calculation unit
１０９：合成可能性算出部 109: Synthesis possibility calculation unit
１１０：表示部 110: Display unit
８０１：乱数発生部 801: Random number generation unit
８０２：生成部 802: Generating unit
８０３：識別部 803: Identifying unit
１３０１：組み合わせ算出部 1301: Combination calculation unit
１３０２：組み合わせ保持部 1302: Combination holding unit
１６０１：エネルギー推定モデル学習部 1601: Energy estimation model learning unit
１６０２：エネルギー推定モデル保持部 1602: Energy estimation model holding unit
１７０１：特性値推定モデル保持部 1701: Characteristic value estimation model holding unit
１７０２：生成モデル保持部 1702: Generative model holding unit
１７０３：特性値算出部 1703: Characteristic value calculation unit

Claims

A material information output method in a material information output device that outputs information about materials,
The material information output device comprises a memory that stores in advance pseudo composition formulas, generated during the learning process by a generative model of pseudo composition formulas obtained by learning a known composition formula list, in association with numerical values indicating learning stages,
The processor of the material information output device,
calculating the synthesis possibility of each pseudo composition formula stored in the memory using the corresponding numerical value, and assigning the calculated synthesis possibility as a teacher label to the pseudo composition formula stored in the memory;
generating an estimation model for estimating the synthesis possibility by learning the pseudo composition formulas to which the teacher labels are assigned;
calculating the synthesis possibility for an arbitrary composition formula using the learned estimation model;
outputting the calculated synthesis possibility;
Material information output method.

Further, identifying, from among the known composition formulas included in the known composition formula list, first known composition formulas of the same type as the material to be estimated,
for each combination of predetermined element groups, casting a vote for each first known composition formula that matches the combination, and holding the voting result in the memory;
assigning the teacher label indicating that the synthesis possibility is 0 to those pseudo composition formulas that match a combination which, according to the voting result, has received no votes;
The material information output method according to claim 1.

The known composition formula list further includes formation energies of known composition formulas,
obtaining from the memory the energy estimation model learned using the known composition formula list and the average and variance of formation energies included in the known composition formula list;
estimating the formation energy of the pseudo composition formula using the energy estimation model, calculating, using the average and the variance, the distance of the formation energy of the pseudo composition formula from the formation energies of the known composition formulas, and correcting the teacher label so that the value of the teacher label decreases as the distance increases,
The material information output method according to claim 1 or 2.

The memory further holds a characteristic value estimation model that outputs characteristic values for the pseudo composition formula,
acquiring the characteristic value range entered by the user;
inputting a characteristic value within the characteristic value range into the generative model to generate the pseudo composition formula;
estimating the characteristic value of the generated pseudo composition formula using the characteristic value estimation model;
displaying, on a display, a scatter diagram in which the synthesis possibility and the estimated characteristic value of the generated pseudo composition formula are plotted on a graph having at least two axes, a first axis indicating the synthesis possibility and a second axis indicating the characteristic value,
The material information output method according to any one of claims 1 to 3.
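
The data flow of this claim, from the user's characteristic value range to the points of the scatter plot, might look like the sketch below. The three models are passed in as plain callables, and the even spacing of target values within the range is an assumption; rendering the points is left to whatever plotting library is in use.

```python
def targets_in_range(low, high, count):
    """Return `count` evenly spaced target values within [low, high]."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        return [(low + high) / 2]
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


def scatter_points(generate, estimate_synthesis, estimate_property, targets):
    """Build (formula, synthesis possibility, characteristic value) tuples.

    generate:           stand-in for the generative model, target -> formula
    estimate_synthesis: stand-in for the estimation model, formula -> [0, 1]
    estimate_property:  stand-in for the characteristic value estimation model
    """
    points = []
    for target in targets:
        formula = generate(target)
        points.append(
            (formula, estimate_synthesis(formula), estimate_property(formula))
        )
    return points
```

Each tuple supplies one point: the second element is the coordinate on the first axis (synthesis possibility) and the third is the coordinate on the second axis (characteristic value).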

The known composition formula list is composed of the ICSD, Materials Project, or NOMAD database,
The material information output method according to any one of claims 1 to 4.

The memory further holds a discriminative model that identifies whether a pseudo composition formula generated by the generative model is genuine or fake,
The generative model and the discriminative model are composed of neural networks trained based on an adversarial learning algorithm,
The material information output method according to any one of claims 1 to 5.

In assigning the teacher label, the teacher label is assigned only to the pseudo composition formulas generated at the beginning of the learning process and the pseudo composition formulas generated at the end of the learning process, and the teacher labels are held in the memory;
The pseudo composition formulas generated at the beginning of the learning process are given a numerical value indicating that the synthesis possibility is 0,
The pseudo composition formulas generated at the end of the learning process are given a numerical value indicating that the synthesis possibility is 1;
The material information output method according to any one of claims 1 to 6.
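
The endpoint-only labeling of this claim can be sketched as below. The claim does not say how wide the "beginning" and "end" of the learning process are, so this sketch assumes they are the first and the last learning step, with steps numbered from 1; the function name and the data layout are hypothetical.

```python
def endpoint_labels(samples, total_steps):
    """Label only the samples from the first and the last learning step.

    samples:     iterable of (formula, step) pairs, step in 1..total_steps
    total_steps: total number of learning steps of the generative model
    Returns {formula: label}; samples from intermediate steps get no label.
    """
    labels = {}
    for formula, step in samples:
        if step == 1:
            labels[formula] = 0.0  # earliest output: treated as unsynthesizable
        elif step == total_steps:
            labels[formula] = 1.0  # final output: treated as synthesizable
    return labels


samples = [("Xq7", 1), ("Li3Fe", 500), ("LiFeO2", 1000)]
labels = endpoint_labels(samples, total_steps=1000)
```

Formulas from intermediate steps, such as the one generated at step 500 here, are simply left out of the training set for the estimation model.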

The teacher label is assigned as the ratio of the numerical value indicating the learning stage to the total number of learning iterations of the generative model,
The material information output method according to any one of claims 1 to 6.
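
The ratio-based label of this claim reduces to one division. In the sketch below, `step` stands for the numerical value indicating the learning stage and `total_steps` for the total number of learning iterations; both names, and the range check, are assumptions made for illustration.

```python
def ratio_label(step, total_steps):
    """Return step / total_steps as the teacher label, a value in [0, 1]."""
    if total_steps <= 0 or not 0 <= step <= total_steps:
        raise ValueError("expected 0 <= step <= total_steps and total_steps > 0")
    return step / total_steps
```

For example, a pseudo composition formula generated at iteration 250 of 1000 receives the label 0.25, and one generated at the final iteration receives 1.0.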

A material information output device that outputs information about materials, comprising:
a first memory that stores in advance pseudo composition formulas, generated in the learning process by a generative model of pseudo composition formulas that is itself generated by learning a known composition formula list, in association with numerical values indicating learning stages;
a teacher label assigning unit that calculates the synthesis possibility of each pseudo composition formula stored in the first memory using the corresponding numerical value, and assigns the calculated synthesis possibility as a teacher label of the pseudo composition formula stored in the first memory;
an estimation model learning unit that generates an estimation model for estimating the synthesis possibility by learning the pseudo composition formulas to which the teacher label is assigned;
a synthesis possibility calculation unit that calculates the synthesis possibility for an arbitrary composition formula using the estimation model;
and an output unit that outputs the synthesis possibility.

A material information output system comprising:
the material information output device according to claim 9;
a second memory that stores the known composition formulas in advance;
a generative model learning unit that generates the generative model by learning the known composition formulas stored in the second memory;
and an input unit that receives input of the arbitrary composition formula.

A program that causes a computer to function as the material information output device according to claim 9.