JP7171477B2

JP7171477B2 - Information processing device, information processing method and information processing program

Info

Publication number: JP7171477B2
Application number: JP2019047310A
Authority: JP
Inventors: 健一磯
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2019-03-14
Filing date: 2019-03-14
Publication date: 2022-11-15
Anticipated expiration: 2039-03-14
Also published as: JP2020149460A

Description

Application of Article 30, Paragraph 2 of the Patent Act: paper submitted to the 2019 Spring Meeting of the Acoustical Society of Japan (meeting overview: http://www.asj.gr.jp/annualmeeting/index.html)

The present invention relates to an information processing device, an information processing method, and an information processing program.

Conventionally, various classification processes have been implemented using models such as DNNs (Deep Neural Networks). To realize classification with such a DNN, methods of training a model by stochastic gradient descent (SGD) are known. For example, a technique is known in which different learning data is distributed to each of several arithmetic units such as CPUs (Central Processing Units) or GPUs (Graphics Processing Units), each arithmetic unit trains a model using the learning data distributed to it, and the learning results of the arithmetic units are then synchronized, with this processing being repeated. In addition, to reduce the cost of synchronizing the learning results of the arithmetic units, a technique is known in which synchronization is performed only after each arithmetic unit has performed the learning process a plurality of times.

“Experiments on Parallel Training of Deep Neural Network using Model Averaging”, Hang Su, Haoyu Chen, Internet <https://arxiv.org/abs/1507.01239> (retrieved March 1, 2019)

However, the technique described above leaves room for improving the accuracy of the model.

For example, because the above-described technique synchronizes models that the arithmetic units have each trained on different learning data, the model finally obtained is only an approximation of a model trained on all of the learning data. Moreover, if synchronization is not performed often enough, the features of the learning data may not be learned appropriately.

The present application has been made in view of the above, and an object thereof is to improve the learning accuracy of a model trained using a plurality of arithmetic units.

An information processing device according to the present application includes: a distribution unit that distributes different learning data to each of a plurality of arithmetic units that individually train a model using the learning data distributed to them; and a synchronization unit that synchronizes the models trained by the arithmetic units in a manner that depends on the results of the learning each arithmetic unit performed using the distributed learning data.

According to one aspect of the embodiments, the learning accuracy of a model trained using a plurality of arithmetic units can be improved.

FIG. 1 is a diagram illustrating an example of processing executed by an information providing device according to an embodiment.
FIG. 2 is a diagram illustrating a configuration example of the information providing device according to the embodiment.
FIG. 3 is a diagram illustrating an example of a functional configuration of a second calculation unit according to the embodiment.
FIG. 4 is a diagram illustrating an example of a functional configuration of a second calculation unit according to the embodiment.
FIG. 5 is a diagram illustrating an example of a hardware configuration.

Modes for implementing the information processing device, information processing method, and information processing program according to the present application (hereinafter referred to as "embodiments") are described in detail below with reference to the drawings. The information processing device, information processing method, and information processing program according to the present application are not limited by these embodiments. The embodiments may be combined as appropriate to the extent that their processing does not conflict. In the following embodiments, identical parts are denoted by the same reference numerals, and duplicate descriptions are omitted.

[1. About the information providing device]
First, an example of an information processing method executed by an information providing device 10, which is an example of an information processing device, will be described with reference to FIG. 1. FIG. 1 is a diagram illustrating an example of processing executed by the information providing device according to the embodiment. FIG. 1 shows an example of the flow of two processes executed by the information providing device 10: a learning process that trains a model, and a classification process that classifies information using the trained model (hereinafter sometimes referred to as a "learning model").

The information providing device 10 shown in FIG. 1 is an information processing device and is realized by, for example, a server device or a cloud system. The data server 100 shown in FIG. 1 manages various kinds of data and is likewise realized by, for example, a server device or a cloud system. The user terminal 200 is a terminal device used by a user who makes use of the results of the classification process, and is realized by, for example, a PC (Personal Computer), a server device, or any of various smart devices.

Here, the information providing device 10 acquires learning data from the data server 100 and causes a model to learn the features of the acquired learning data. Then, upon acquiring various kinds of measurement data from the user terminal 200, the information providing device 10 uses the learning model to execute a classification process according to the features of the measurement data, and provides the classification result to the user terminal 200.

In this series of processes, any settings may be adopted regarding what data is used as learning data, which features of the learning data the model learns, what data is used as measurement data, and which features the classification is based on. As a specific example, the information providing device 10 acquires, as learning data, attribute information indicating demographic and psychographic attributes of users, and various kinds of history information such as histories of browsed content, purchase histories of transaction targets (products and services), and location histories. If information indicating, for example, the type of advertisement a user selected is registered as a label for each piece of learning data, the information providing device 10 trains the model so that, when the history information is input, the model outputs information indicating the corresponding label (that is, the advertisement selected by the user to whom the history information corresponds). Then, upon acquiring various attribute information of a user as measurement data, the information providing device 10 inputs the acquired attribute information into the learning model and thereby estimates the advertisement that the user is likely to select.

Such a model is realized by a neural network in which a plurality of nodes are connected via connection paths, each assigned its own connection coefficient; that is, by a DNN. The model may be a neural network of any structure, such as an autoencoder, a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), or its extension, the LSTM (Long Short-Term Memory).

When causing the model to learn the features of a single piece of learning data, the information providing device 10 performs learning by stochastic gradient descent. For example, the information providing device 10 sets an arbitrary objective function according to the input information and output information of the model, and corrects the connection coefficients (that is, the parameters) of the model using error backpropagation or the like so that the objective function satisfies a predetermined condition. In this way the model learns the features of the learning data.

For example, when the objective function indicates the error between the input information and the output information, the information providing device 10 corrects the parameters so that the value of the objective function decreases. On the other hand, when the objective function indicates, for example, a value based on cross entropy, the information providing device 10 corrects the parameters so that the value of the objective function increases. In the following description, the amount by which the objective function changes in the direction that satisfies the predetermined condition, between before and after learning with certain learning data, may therefore be referred to as the "improvement amount." For example, when the objective function indicates the error between the input information and the output information, the improvement amount is the value of the objective function before learning minus its value after learning; when the objective function indicates a value based on cross entropy, the improvement amount is the value of the objective function after learning minus its value before learning.
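The sign convention for the improvement amount can be illustrated with a minimal Python sketch. The function name `improvement_amount` and the `minimize` flag are hypothetical and do not appear in the source; the numeric values are arbitrary.

```python
def improvement_amount(before: float, after: float, minimize: bool = True) -> float:
    """Return how far the objective moved in the desired direction.

    minimize=True  : error-like objective, improvement = before - after
    minimize=False : objective meant to increase, improvement = after - before
    (Hypothetical helper; the flag name is an assumption for illustration.)
    """
    return before - after if minimize else after - before


# Error-like objective that drops from 0.75 to 0.5.
print(improvement_amount(0.75, 0.5, minimize=True))
# Objective meant to increase that rises from -0.75 to -0.5.
print(improvement_amount(-0.75, -0.5, minimize=False))
```

With either convention a positive result means the objective moved toward the predetermined condition, which lets improvement amounts be compared across models.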

[1-1. About mini-batch learning and model averaging]
Before describing the learning process executed by the information providing device 10, an outline of mini-batch learning is given. For example, when executing a learning process using mini-batch learning, the information providing device 10 takes M pieces of data randomly selected from all N pieces of learning data as a mini-batch, and updates the model parameters once per mini-batch.

One conceivable way to speed up learning is to execute such mini-batch learning in parallel. For example, a different mini-batch is distributed to each of a plurality of arithmetic units such as CPUs, GPUs, or their cores, and each arithmetic unit executes the model learning process independently. The learning results of the arithmetic units are then synchronized, and the distribution of new mini-batches is repeated. For example, a new model is generated using the average values of the parameters of the models trained by the arithmetic units, the new model is distributed to each arithmetic unit, and the learning process is executed again using different mini-batches.
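The simple-average synchronization step described here can be sketched in Python as follows. This is only an illustration under the assumption that each model is represented as a flat list of parameter values; `average_models` is a hypothetical name.

```python
from typing import List


def average_models(models: List[List[float]]) -> List[float]:
    """Element-wise mean of the parameter vectors held by each arithmetic unit.

    Assumes every model is a flat list of floats of the same length.
    """
    if not models:
        raise ValueError("at least one model is required")
    k = len(models)
    return [sum(values) / k for values in zip(*models)]


# Parameters after one round of independent learning on three units.
worker_params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
synced = average_models(worker_params)
print(synced)  # [3.0, 4.0]
```

The averaged vector would then be copied back to every arithmetic unit before the next mini-batches are distributed.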

When mini-batch learning is parallelized in this way over K arithmetic units, the portion of the learning time spent computing the correction amounts for the model is expected to be reduced to 1/K. However, in such mini-batch learning, the models independently trained by the arithmetic units are synchronized through communication between the units. Consequently, with N pieces of learning data in total and M pieces per mini-batch, synchronizing all arithmetic units after every mini-batch adds N/M synchronization operations to the processing time as overhead.

To reduce this synchronization overhead, a model averaging method is known in which synchronization is performed each time mini-batch learning has been performed a predetermined number of times. For example, when training a model by the model averaging method, the information providing device 10 does not synchronize after every mini-batch; instead, it has each arithmetic unit perform mini-batch learning a predetermined number of times (for example, F times) and then synchronizes the models of the arithmetic units. In this case the number of synchronizations becomes N/(M×F), so the synchronization overhead is reduced to 1/F.
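As a worked example of these counts, the following Python sketch compares N/M with N/(M×F). The concrete values of N, M, and F are illustrative assumptions, not figures from the source, and `sync_counts` is a hypothetical helper.

```python
from typing import Tuple


def sync_counts(n: int, m: int, f: int) -> Tuple[int, int]:
    """Return (syncs when synchronizing every mini-batch,
               syncs when synchronizing every f mini-batches)."""
    per_batch = n // m        # N / M
    averaged = n // (m * f)   # N / (M x F)
    return per_batch, averaged


# Illustrative values: N = 1,000,000 samples, M = 100 per mini-batch, F = 10.
print(sync_counts(1_000_000, 100, 10))  # (10000, 1000)
```

Here the number of synchronizations drops by a factor of F = 10, which matches the 1/F reduction stated above.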

[1-2. About the learning process]
However, in the model averaging method described above, reducing the number of synchronizations may degrade the classification accuracy of the model. In addition, the model finally obtained by mini-batch learning is an average of models that have each learned the features of different learning data. It is therefore only an approximation of the model that would be obtained by inputting all of the learning data, one piece at a time, into a single model and correcting the parameters so that the objective function improves each time a piece is input. Mini-batch learning and the model averaging method thus leave room for improving model accuracy.

The information providing device 10 therefore executes the following learning process. First, the information providing device 10 distributes different learning data to each of a plurality of arithmetic units that individually train a model using the learning data distributed to them. The information providing device 10 then synchronizes the models trained by the arithmetic units in a manner that depends on the results of the learning each arithmetic unit performed using the distributed learning data. In other words, rather than simply taking the plain average of the models as the synchronization result, the information providing device 10 synchronizes the models adaptively according to the learning results of the arithmetic units.

For example, a model with a large improvement amount can be regarded as a model that has appropriately learned the features of the learning data. Among the models trained by the arithmetic units, the parameters of a model with a large improvement amount are therefore considered to contribute greatly to the accuracy of the final model. Accordingly, the information providing device 10 synchronizes the models so that a model with a larger improvement amount is given more importance. That is, the information providing device 10 generates, as the synchronization result, a model that integrates the individual models with a larger weight applied to a model with a larger improvement amount. As a result, the information providing device 10 can generate a synchronization result that preferentially uses models with larger improvement amounts, that is, models estimated to raise the accuracy of the final model, and can thereby improve the accuracy of the model finally generated.

The information providing device 10 also dynamically changes the timing at which models are synchronized during the learning process. For example, the information providing device 10 synchronizes the models so that the number of times each arithmetic unit trains its model on newly distributed learning data between synchronizations is random. More specifically, the information providing device 10 generates a random integer in the interval [1, F] and synchronizes the models after mini-batch learning has been performed the number of times indicated by that random number. This is considered to reduce the possibility that the objective function of the finally generated model becomes trapped in a local minimum, so that the accuracy of the finally generated model can be improved while the synchronization overhead is reduced.

[1-3. An example of the flow of processing executed by the information providing device]
An example of the flow of processing executed by the information providing device 10 is described below with reference to FIG. 1. The following description uses an example in which K GPUs, GPU #1 to GPU #K, are used in parallel as the arithmetic units to train the model. First, the information providing device 10 acquires learning data from the data server 100 (step S1). In this case, the information providing device 10 registers the learning data in the learning data database 31. The information providing device 10 then dynamically varies the timing at which the models trained by the GPUs in mini-batch learning are synchronized (step S2).

For example, the information providing device 10 randomly extracts learning data #1-1 to #1-M from the learning data database 31. The information providing device 10 then divides the extracted learning data into K mini-batches and distributes the mini-batches to GPUs #1 to #K. That is, the information providing device 10 distributes M/K pieces of learning data to each of GPUs #1 to #K as a mini-batch. In other words, the information providing device 10 distributes different learning data to each arithmetic unit.
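The extraction and division step can be sketched in Python as follows. This is a minimal illustration: `draw_minibatches` is a hypothetical name, the data set is a stand-in range of integers, and the requirement that M be divisible by K is an assumption made to keep the sketch short.

```python
import random
from typing import List, Sequence


def draw_minibatches(dataset: Sequence, m: int, k: int,
                     rng: random.Random) -> List[list]:
    """Randomly draw m samples and split them into k mini-batches of m // k each."""
    if m % k != 0:
        # Assumption of this sketch only; the source does not discuss remainders.
        raise ValueError("m must be divisible by k in this sketch")
    sample = rng.sample(list(dataset), m)  # sampling without replacement
    size = m // k
    return [sample[i * size:(i + 1) * size] for i in range(k)]


# 12 samples drawn from a stand-in data set of 100 items, split across 4 units.
batches = draw_minibatches(range(100), m=12, k=4, rng=random.Random(0))
for unit, batch in enumerate(batches, start=1):
    print(f"GPU #{unit}: {batch}")
```

Because the samples are drawn without replacement and sliced into disjoint ranges, no two arithmetic units receive the same piece of learning data in a given round.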

Each of GPUs #1 to #K then trains the model using the learning data distributed to it. That is, each of GPUs #1 to #K executes mini-batch learning using the distributed mini-batch. For example, GPU #1 generates M/K copies of the model to be trained and inputs a different piece of learning data into each copy. GPU #1 then corrects the parameters of each copy so that the objective function of that copy improves, and takes the model obtained by integrating the corrected parameters as the first learning result. For example, GPU #1 may use the average of the corrected parameters of the copies as the parameters of the resulting model. Alternatively, GPU #1 may train a single model using the M/K pieces of learning data without copying the model. The other GPUs #2 to #K likewise execute mini-batch learning using the learning data individually distributed to them.

Here, the information providing device 10 does not synchronize the learning results of GPUs #1 to #K after every round of mini-batch learning; instead, it synchronizes them each time mini-batch learning has been performed a random number of times. For example, the information providing device 10 generates a random number within a predetermined range and, if the generated random number is "3", has mini-batch learning executed three times.

For example, the information providing device 10 divides learning data #1-1 to #1-M into K mini-batches #1-1 to #1-K, and distributes mini-batches #1-1 to #1-K to the individual GPUs #1 to #K to have the first round of mini-batch learning executed. Next, without synchronizing the models of GPUs #1 to #K, the information providing device 10 extracts new learning data #2-1 to #2-M from the learning data database 31 and divides it into K mini-batches #2-1 to #2-K. The information providing device 10 then distributes mini-batches #2-1 to #2-K to the individual GPUs #1 to #K to have the second round of mini-batch learning executed. Similarly, the information providing device 10 generates new mini-batches #3-1 to #3-K and distributes them to the individual GPUs #1 to #K to have the third round of mini-batch learning executed. Once mini-batch learning has been executed three times, the information providing device 10 synchronizes the models of GPUs #1 to #K.

The information providing device 10 then generates a new random number and, if the generated random number is "1", for example, has each of GPUs #1 to #K execute the fourth round of mini-batch learning. After the fourth round of mini-batch learning, the information providing device 10 synchronizes the models of GPUs #1 to #K. In this way, the information providing device 10 performs model synchronization so that the number of rounds of mini-batch learning executed by GPUs #1 to #K between synchronizations is random. That is, the information providing device 10 randomly changes the timing at which the models of GPUs #1 to #K are synchronized.
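A random synchronization schedule of this kind can be sketched in Python as follows, assuming the random number is drawn from the interval [1, F] described earlier. The name `sync_schedule` is hypothetical, and forcing a final synchronization at the last round is an assumption of this sketch.

```python
import random
from typing import List


def sync_schedule(total_rounds: int, f: int, rng: random.Random) -> List[int]:
    """Return the mini-batch round numbers after which the models are synchronized.

    Each gap between synchronizations is a random integer in [1, f];
    the last gap is clipped so that the schedule ends at total_rounds.
    """
    sync_points = []
    done = 0
    while done < total_rounds:
        # random.randint includes both endpoints.
        done = min(done + rng.randint(1, f), total_rounds)
        sync_points.append(done)
    return sync_points


schedule = sync_schedule(total_rounds=20, f=4, rng=random.Random(1))
print(schedule)
```

In the example from the text, drawing "3" and then "1" would correspond to a schedule that begins with synchronizations after rounds 3 and 4.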

That is, the information providing device 10 distributes different learning data to each of a plurality of arithmetic units, each of which causes a model to learn the features of the learning data by correcting the values of the model's parameters using that data. The information providing device 10 then synchronizes the parameter values of the models at random timing and distributes new learning data to each arithmetic unit, thereby repeatedly executing processing in which the synchronized model is trained using the newly distributed learning data.

When synchronizing the models of GPUs #1 to #K, the information providing device 10 applies weights based on the improvement amount of each model's objective function and synchronizes the GPUs to a model obtained by combining the parameters of the models (step S3). For example, when synchronizing the models after the third round of mini-batch learning, the information providing device 10 calculates the improvement amount of the objective function of each model. For example, the information providing device 10 calculates improvement amount #1 from the value of the objective function of model #1-2, which GPU #1 generated in the second round of mini-batch learning, and the value of the objective function of model #1-3, which it generated in the third round. Similarly, the information providing device 10 calculates improvement amounts #2 to #K from the objective function values of models #2-2 to #K-2, which GPUs #2 to #K generated in the second round of mini-batch learning, and those of models #2-3 to #K-3, which they generated in the third round.

The information providing device 10 then sets weights #1 to #K based on the values of improvement amounts #1 to #K. For example, the information providing device 10 may use, as weights #1 to #K, the values obtained by dividing each of improvement amounts #1 to #K by the sum of improvement amounts #1 to #K. The information providing device 10 may set weights #1 to #K calculated by any method, as long as a model with a larger improvement amount receives a larger weight.
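The normalization example (each improvement amount divided by the sum of all improvement amounts) can be sketched in Python as follows. The name `improvement_weights` is hypothetical. The source does not say how negative or all-zero improvement amounts are handled; clamping negatives to zero and falling back to uniform weights are assumptions of this sketch only.

```python
from typing import List


def improvement_weights(improvements: List[float]) -> List[float]:
    """Weight each model by its share of the total improvement amount."""
    # Assumption: a model whose objective got worse contributes no weight.
    clamped = [max(v, 0.0) for v in improvements]
    total = sum(clamped)
    if total == 0.0:
        # Assumption: with no improvement anywhere, fall back to a simple average.
        return [1.0 / len(clamped)] * len(clamped)
    return [v / total for v in clamped]


# Improvement amounts #1 to #3 of three arithmetic units.
print(improvement_weights([1.0, 3.0, 4.0]))  # [0.125, 0.375, 0.5]
```

The weights sum to 1, so applying them to the parameters yields a weighted average, and a larger improvement amount always yields a larger weight.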

Next, the information providing device 10 integrates models #1-3 to #K-3 using weights #1 to #K. For example, the information providing device 10 calculates the values obtained by multiplying the parameters of model #1-3 by weight #1. Similarly, the information providing device 10 calculates the values obtained by multiplying the parameters of each of models #2-3 to #K-3 by the corresponding weight #2 to #K. The information providing device 10 then sums the calculated values to generate the parameters of model #X-4, which integrates models #1-3 to #K-3.
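The weighted integration of parameters can be sketched in Python as follows, again assuming that each model is a flat list of parameter values. `merge_models` is a hypothetical name, and the example weights are arbitrary.

```python
from typing import List


def merge_models(models: List[List[float]], weights: List[float]) -> List[float]:
    """Multiply each model's parameters by its weight and sum across models."""
    if len(models) != len(weights):
        raise ValueError("one weight per model is required")
    return [
        sum(w * p for w, p in zip(weights, values))
        for values in zip(*models)  # iterate parameter position by position
    ]


# Two models with weights 0.25 and 0.75.
merged = merge_models([[1.0, 2.0], [3.0, 6.0]], [0.25, 0.75])
print(merged)  # [2.5, 5.0]
```

With equal weights of 1/K this reduces to the simple average used in the conventional model averaging method.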

The information providing device 10 then distributes model #X-4 to each of GPUs #1 to #K as the model to be trained in the fourth round. As a result, each of GPUs #1 to #K executes mini-batch learning on model #X-4 using a different mini-batch.

The information providing device 10 repeats learning with new learning data until a predetermined condition is satisfied. When the predetermined condition is satisfied, the information providing device 10 generates a learning model by integrating the final models (step S4). For example, the information providing device 10 repeats the learning process until mini-batch learning has been performed using all of the learning data registered in the learning data database 31. When mini-batch learning using all of the learning data is complete, the information providing device 10 synchronizes the models of GPUs #1 to #K. For example, the information providing device 10 may take the average of the parameters of the models, or may integrate the models with weights according to the improvement amount of each model's objective function.

The information providing device 10 then performs classification using the generated learning model. For example, it acquires measurement data from the user terminal 200 (step S5). In this case, the information providing device 10 inputs the measurement data to the learning model and provides the user terminal 200 with a classification result based on the information output by the learning model (step S6). Instead of providing the classification result itself, the information providing device 10 may deliver information that depends on the classification result, for example by distributing content according to the result. The information providing device 10 may also provide the classification result not to the user terminal 200 but to a service providing server that provides various services to the user terminal 200. In that case, the service providing server provides the user terminal 200 with a service whose content depends on the classification result.

[1-4. Synchronization timing]
In the above description, the information providing device 10 synchronized, at random timing, the models on which each computing device (that is, each GPU) had performed mini-batch learning. However, the embodiment is not limited to this. As long as it changes the model synchronization timing dynamically, the information providing device 10 may determine the synchronization timing based on any index.

For example, the information providing device 10 may synchronize the models when the result of the learning executed by each computing device satisfies a predetermined condition. For example, it may synchronize the models when, for a model trained by at least one of the computing devices, the amount of improvement between the objective function value before learning and the value after learning satisfies a predetermined condition.

As a more specific example, the information providing device 10 acquires the amount of improvement in each model's objective function and counts the models whose improvement exceeds a predetermined threshold. When that count exceeds a predetermined threshold, the information providing device 10 may synchronize the models. For example, it may synchronize the models when even one model's improvement exceeds the threshold. It may also set the weight of a model whose improvement exceeds the threshold to a value larger than the weights of the other models and then integrate the models.

The information providing device 10 may also set the synchronization timing according to the cumulative amount of improvement. For example, it may calculate the cumulative improvement of each model every time mini-batch learning is performed, and synchronize when the number of models whose cumulative improvement exceeds a predetermined threshold itself exceeds a predetermined count. It may also synchronize when the cumulative improvement over all models exceeds a predetermined threshold. Furthermore, it may randomly change the improvement threshold each time synchronization is performed.
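The count-based and cumulative triggers described in this section can be sketched as follows. This is only an illustration under stated assumptions: `should_sync` and `SyncTrigger` are hypothetical names, the thresholds are free parameters, and "exceeds" is read as a strict comparison.

```python
def should_sync(improvements, improvement_threshold, count_threshold):
    """True when more than count_threshold models improved by more than
    improvement_threshold in the latest round.

    count_threshold=0 gives the "even one model" variant.
    """
    exceeding = sum(1 for d in improvements if d > improvement_threshold)
    return exceeding > count_threshold


class SyncTrigger:
    """Cumulative variant: accumulates improvements per model across rounds."""

    def __init__(self, improvement_threshold, count_threshold, num_models):
        self.improvement_threshold = improvement_threshold
        self.count_threshold = count_threshold
        self.cumulative = [0.0] * num_models

    def update(self, improvements):
        """Add the latest improvements and report whether to synchronize."""
        for i, d in enumerate(improvements):
            self.cumulative[i] += d
        return should_sync(
            self.cumulative, self.improvement_threshold, self.count_threshold
        )

    def reset(self):
        """Clear the accumulated improvements, e.g. after a synchronization."""
        self.cumulative = [0.0] * len(self.cumulative)
```

A caller would invoke `update` after every round of mini-batch learning and call `reset` once the models have been synchronized.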

Note that the information providing device 10 may synchronize so that the number of models whose improvement exceeds a predetermined threshold does not exceed a predetermined count. For example, based on the history or cumulative total of the improvement amounts, it may estimate whether, after the next round of mini-batch learning, the number of models whose improvement exceeds the threshold would exceed the predetermined count, and synchronize if it estimates that it would.

[1-5. Synchronization method]
In the above description, the information providing device 10 calculated a weighted sum of the parameters of the models, using weights that reflect the amount of improvement between each model's objective function value before and after learning, and used that weighted sum as the parameters of the synchronized model. However, the embodiment is not limited to this.

For example, the information providing device 10 may execute any processing as long as it synchronizes the models in a manner that depends on the objective function values of the models trained by the computing devices. For example, it may extract only the models whose objective function value exceeds a predetermined threshold (or only those whose value falls below a predetermined threshold) and use the average or weighted sum of the extracted models' parameters as the synchronization result.

The information providing device 10 may also use, as the synchronization result, a model obtained by integrating the models with weights that depend on each model's objective function value. That is, it may use weights based on the objective function value itself rather than on the amount of improvement. For example, it may assign a larger weight to a model the lower (or the higher) that model's objective function value is.
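The two objective-value-based variants above, threshold filtering followed by averaging and weights derived from the objective value itself, might look like the sketch below. It assumes the objective is a non-negative loss where lower is better. The inverse-loss weighting is one illustrative choice that the source does not specify, and all function names are hypothetical.

```python
def select_by_objective(models, objectives, threshold):
    """Keep only the models whose objective (loss) value is below threshold."""
    return [m for m, f in zip(models, objectives) if f < threshold]


def average_parameters(models):
    """Element-wise mean of a non-empty list of parameter vectors."""
    n = len(models)
    return [sum(column) / n for column in zip(*models)]


def weights_from_objectives(objectives, eps=1e-12):
    """Assumed weighting: proportional to 1 / loss, normalized to sum to 1.

    eps guards against division by zero when a loss is exactly 0.
    """
    inverse = [1.0 / (f + eps) for f in objectives]
    total = sum(inverse)
    return [v / total for v in inverse]
```

If a higher objective value is better (for example, an accuracy-like score), the comparison in `select_by_objective` and the weighting would be reversed accordingly.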

The information providing device 10 may also identify the model with the smallest (or largest) objective function value and distribute it to each computing device as the synchronization result. Alternatively, it may identify the model with the largest improvement in the objective function and distribute that model to each computing device as the synchronization result.

The information providing device 10 may also synchronize the models using a genetic algorithm that selects models according to their objective function values. For example, it takes each model's objective function value or improvement amount as that model's fitness, and generates the next generation of models by copying, crossing over, or mutating models selected with a probability that depends on fitness (these are hereinafter sometimes referred to as "operations"). For example, the information providing device 10 may select two models and randomly cross over their parameters, or may randomly change the parameters of a selected model. Through such processing, it may generate, from the K models that have undergone the n-th round of mini-batch learning, K new models to undergo the (n+1)-th round, and distribute the generated K models to the computing devices. The information providing device 10 may also synchronize the models in a manner based on any other genetic algorithm.
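One possible reading of the genetic-algorithm synchronization is sketched below. The details are assumptions made for illustration: the operation is chosen uniformly at random, crossover is uniform per parameter, mutation adds Gaussian noise, and fitness values are non-negative with a positive sum (for a loss, the caller would first convert it so that larger means better). The function name and the `mutation_scale` parameter are hypothetical.

```python
import random


def next_generation(models, fitness, rng, mutation_scale=0.01):
    """Produce K new parameter vectors from K trained ones.

    Parents are drawn with probability proportional to fitness, then one of
    copy, crossover, or mutation is applied to produce each child.
    """
    new_models = []
    for _ in range(len(models)):
        operation = rng.choice(("copy", "crossover", "mutation"))
        parent_a, parent_b = rng.choices(models, weights=fitness, k=2)
        if operation == "copy":
            child = list(parent_a)
        elif operation == "crossover":
            # Uniform crossover: take each parameter from either parent.
            child = [a if rng.random() < 0.5 else b
                     for a, b in zip(parent_a, parent_b)]
        else:
            # Mutation: perturb every parameter with small Gaussian noise.
            child = [a + rng.gauss(0.0, mutation_scale) for a in parent_a]
        new_models.append(child)
    return new_models
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing synchronization strategies.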

Note that, besides the objective function, the information providing device 10 may synchronize the models in any manner as long as it adaptively changes how the models are synchronized according to the learning results at the computing devices. For example, it may calculate the weighted sum of the models' parameters using weights that depend on the time each computing device needed for mini-batch learning, or it may select the models subject to the genetic algorithm operations with a probability that depends on that time.

[1-6. Synchronization targets]
Note that the information providing device 10 need not synchronize the models trained by all the computing devices. For example, it may synchronize the models trained by only some of the computing devices. For example, it may divide the computing devices into groups of a predetermined size, such as GPUs #1 to #10 and GPUs #11 to #20, and synchronize within each group. For example, when among the models trained by all the computing devices there is a model whose objective function improvement exceeds a predetermined threshold, the information providing device 10 may synchronize the models only within the group containing the computing device that trained that model. When such processing is performed, a computing device that was not a synchronization target continues learning with the model that resulted from its previous mini-batch learning.
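Group-wise synchronization triggered by an improvement threshold could be sketched like this. The sketch assumes contiguous fixed-size groups and simple averaging inside a triggered group; the source leaves the in-group synchronization method open, and the function name is hypothetical.

```python
def sync_triggered_groups(models, improvements, group_size, threshold):
    """Average parameters only inside groups that contain at least one model
    whose improvement exceeds threshold; other models are left unchanged."""
    result = [list(m) for m in models]
    for start in range(0, len(models), group_size):
        members = range(start, min(start + group_size, len(models)))
        if any(improvements[i] > threshold for i in members):
            n = len(members)
            mean = [sum(models[i][j] for i in members) / n
                    for j in range(len(models[0]))]
            for i in members:
                result[i] = list(mean)
    return result
```

Models in groups that were not triggered keep their own parameters, matching the statement that non-target devices continue learning from their previous result.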

The information providing device 10 may also synchronize the models trained by those computing devices whose communication delay falls within a predetermined range. For example, it may treat as synchronization targets the models trained by a predetermined number of computing devices that are physically located close to one another.

The information providing device 10 may also synchronize the models trained by a randomly selected subset of the computing devices. For example, it may treat as synchronization targets only the models trained by a predetermined number of computing devices selected at random. It may also treat as synchronization targets only the models whose objective function value or improvement amount exceeds a predetermined threshold, together with the models trained by a predetermined number of randomly selected computing devices.

The information providing device 10 may also synchronize the models trained by a number of computing devices that depends on the number of dimensions of information each computing device can process and on the number of learning data items distributed to all the computing devices. That is, it may treat as synchronization targets the number of models estimated to allow efficient learning, given the performance and number of the computing devices, the total number of learning data items, the number of learning data items per mini-batch, and so on.

[1-7. Computing devices]
The above example described processing that uses multiple GPUs as the computing devices, but the embodiment is not limited to this. For example, the information providing device 10 may apply the learning process described above to multiple CPUs, or may regard one or more computer clusters as a single computing device and apply the learning process to a system in which such computer clusters are connected by a network. The information providing device 10 may also execute the learning process by regarding the multiple cores included in a single CPU or GPU as computing devices. It may regard GPUs or GPU cores arranged on one or more graphics cards as computing devices. It may also regard multiple CPUs or GPUs as a single computing device, or regard one or more cores included in those CPUs or GPUs as a single computing device.

The information providing device 10 may have the computing devices described above in its own housing or in a different housing. For example, it may execute the learning process using computing devices in a server device connected via a network.

In other words, the information providing device 10 may regard any device as a computing device, as long as it executes the learning process described above by treating devices that can each train a model individually as computing devices. The computing devices need not each have an independent storage device; for example, all or some of the computing devices may share storage such as memory or registers. Each computing device may also be, for example, a so-called virtual machine.

[1-8. Executing entity]
The learning process described above may be executed by any entity. For example, the information providing device 10 may have, separately from the computing devices, a control device that controls them. In that case, the control device may distribute the learning data and perform the synchronization. The determination of the synchronization timing and the model synchronization processing may also be realized by the computing devices operating cooperatively.

[1-9. Relationship between synchronization timing and synchronization method]
The information providing device 10 may execute the dynamic change of synchronization timing and the learning-result-dependent model synchronization independently of each other or in association with each other. For example, when it changes the synchronization timing dynamically, it may synchronize the models by computing a simple average. Conversely, when it synchronizes the models in a manner that depends on the learning results, it need not change the synchronization timing dynamically.

The information providing device 10 may also change the synchronization method each time it executes synchronization. For example, each time it executes synchronization, it may select one method from among several, either at random or with a probability that depends on the learning results, and synchronize the models with the selected method. Candidate methods include synchronizing the models by simple averaging, using weights that depend on the amount of improvement, and using the model with the largest improvement as the synchronization result. The information providing device 10 may also synchronize at a timing that depends on the synchronization method used the previous time. For example, after synchronizing the models by simple averaging, it may perform the next synchronization when the improvement of any model exceeds a predetermined threshold; after synchronizing with improvement-dependent weights, it may randomly select the number of mini-batch learning rounds until the next synchronization.

[2. Example of functional configuration]
An example of the functional configuration of the information providing device 10 that implements the learning process described above is described below. FIG. 2 is a diagram showing a configuration example of the information providing device according to the embodiment. As shown in FIG. 2, the information providing device 10 has a communication unit 20, a storage unit 30, a first calculation unit 40, and a second calculation unit 50.

The communication unit 20 is realized by, for example, a NIC (Network Interface Card). The communication unit 20 is connected to the network N by wire or wirelessly and exchanges information with, for example, the data server 100 and the user terminal 200.

The storage unit 30 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or flash memory, or a storage device such as a hard disk or optical disk. The storage unit 30 stores a learning data database 31 and a model database 32.

Learning data is registered in the learning data database 31. For example, various learning data acquired from the data server 100 is registered there. The model database 32 holds the data of the learning model trained by the learning process described above.

The first calculation unit 40 is a controller and is realized by, for example, a processor such as a CPU (Central Processing Unit) or MPU (Micro Processing Unit) executing various programs stored in a storage device inside the information providing device 10, using a RAM or the like as a work area. Alternatively, the first calculation unit 40 may be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array).

As shown in FIG. 2, the first calculation unit 40 has a learning control unit 41 and an information providing unit 42. The learning control unit 41 executes the learning process described above by controlling the second calculation unit 50. For example, the learning control unit 41 acquires learning data from the data server 100 and registers it in the learning data database 31. The learning control unit 41 also provides the learning data registered in the learning data database 31 to the second calculation unit 50 and has it execute the learning process, thereby obtaining a learning model. The learning control unit 41 then registers the learning model in the model database 32.

The information providing unit 42 provides classification results for measurement data obtained with the learning model. For example, on acquiring measurement data from the user terminal 200, the information providing unit 42 reads the learning model from the model database 32 and inputs the measurement data to it. It then outputs information that depends on the classification result output by the learning model to the user terminal 200 or the like.

The second calculation unit 50 is an information processing unit having multiple computing devices, and is realized by, for example, a graphics card on which multiple GPUs or GPU cores are arranged, or by multiple graphics cards. For example, the second calculation unit 50 has a calculation unit 51 and a calculation control unit 52.

FIG. 3 is a diagram showing an example of the functional configuration of the second calculation unit according to the embodiment. As shown in FIG. 3, the calculation unit 51 has multiple computing devices. Each computing device is, for example, a GPU or a GPU core, and trains a model using a mini-batch of the distributed learning data. That is, each computing device holds its own model and corrects the values of the model's parameters using the distributed learning data, thereby making the model learn the features of the learning data.

The calculation control unit 52 has a distribution unit 521 and a synchronization unit 522. The distribution unit 521 and the synchronization unit 522 are realized by the second calculation unit 50 executing various programs stored in a storage device inside the information providing device 10, using a RAM or the like as a work area. The calculation control unit 52 may be realized by, for example, any one of the computing devices of the calculation unit 51.

The distribution unit 521 distributes different learning data to each of the multiple computing devices, each of which trains a model individually using the learning data distributed to it. For example, when the calculation unit 51 has K computing devices, the distribution unit 521 generates mini-batches by dividing M data items randomly selected from the learning data into K parts. It then distributes the generated mini-batches to different computing devices, causing each computing device to perform mini-batch learning. After each computing device has performed mini-batch learning, the distribution unit 521 generates K mini-batches from M new learning data items and distributes them to the computing devices again.
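The distribution step, drawing M items at random and splitting them into K mini-batches, can be illustrated as follows. The function name is hypothetical, sampling is assumed to be without replacement from the not-yet-distributed data, and the items are dealt round-robin so that mini-batch sizes differ by at most one.

```python
import random


def make_minibatches(undistributed, m, k, rng):
    """Draw m items at random and split them into k mini-batches.

    Returns (batches, remaining): one mini-batch per computing device and
    the data that is still undistributed.
    """
    picked_indices = rng.sample(range(len(undistributed)), m)
    picked_set = set(picked_indices)
    picked = [undistributed[i] for i in picked_indices]
    remaining = [x for i, x in enumerate(undistributed) if i not in picked_set]
    # Deal the picked items round-robin so batch sizes differ by at most one.
    batches = [picked[i::k] for i in range(k)]
    return batches, remaining
```

Calling the function repeatedly on `remaining` yields fresh mini-batches each round until the learning data is exhausted.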

The synchronization unit 522 synchronizes the models trained by the computing devices in a manner that depends on the results of the learning each computing device executed with the distributed learning data. More specifically, it synchronizes the parameter values of the models and distributes the synchronized parameter values to each computing device. That is, it distributes the model resulting from synchronization to each computing device and has mini-batch learning continue.

The synchronization unit 522 also dynamically changes the occasions on which the models trained by the multiple computing devices are synchronized. For example, it synchronizes the models so that the number of times each computing device trains the model with newly distributed learning data is random. More specifically, once each computing device has performed mini-batch learning a randomly selected number of times, the synchronization unit 522 synchronizes the models in a manner that depends on the objective function values of the models trained by the computing devices. It then has mini-batch learning of each model performed again for a randomly selected number of times.
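Random synchronization intervals can be sketched as a schedule of the mini-batch rounds after which a synchronization takes place. The uniform draw from 1 to `max_interval` and the forced final synchronization are assumptions made for this example, and the function name is hypothetical.

```python
import random


def random_sync_schedule(total_rounds, max_interval, rng):
    """Return the (1-based) mini-batch rounds after which the models are
    synchronized, with each gap drawn uniformly from 1..max_interval."""
    sync_rounds = []
    current = 0
    while current < total_rounds:
        current += rng.randint(1, max_interval)
        # Clamp so the last synchronization lands on the final round.
        sync_rounds.append(min(current, total_rounds))
    return sync_rounds
```

A training loop would run mini-batch learning on every device each round and invoke the chosen synchronization method whenever the round number appears in the schedule.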

For example, when each computing device has performed mini-batch learning, the synchronization unit 522 identifies the objective function value of the model each computing device trained individually and uses, as the synchronization result, a model obtained by integrating the models with weights that depend on the identified values. For example, it may use as the synchronization result a model obtained by integrating the models with weights that depend on the amount of improvement between each model's objective function value before and after learning. The synchronization unit 522 then distributes the model resulting from synchronization to each computing device and has mini-batch learning performed again.

The synchronization unit 522 may distribute to each computing device, as the synchronization result, the model with the largest improvement between the objective function value before learning and the value after learning. It may also synchronize the models using a genetic algorithm that selects models according to their objective function values.

The synchronization unit 522 may also synchronize the models when the result of the learning executed by each computing device satisfies a predetermined condition. For example, it may acquire the amount of improvement in the objective function value of each computing device's model after every round of mini-batch learning, and synchronize the models when the improvement of at least one model satisfies a predetermined condition.

The synchronization unit 522 may also synchronize the models trained by only some of the computing devices. For example, it may synchronize the models trained by those computing devices whose communication delay falls within a predetermined range. It may also synchronize the models trained by a randomly selected subset of the computing devices.

The synchronization unit 522 may also synchronize the models trained by a number of arithmetic units that depends on the number of dimensions of information each arithmetic unit can compute and on the number of pieces of learning data distributed to all the arithmetic units. That is, the synchronization unit 522 may dynamically change the number of arithmetic units to be synchronized according to the performance of each arithmetic unit, the amount of learning data, and the like.

When all the learning data has been distributed, or when the improvement amount of each model has remained unchanged for a sustained period, the synchronization unit 522 determines that the learning end condition is satisfied and generates a learning model that integrates the models. The synchronization unit 522 then outputs the learning model to the first calculation unit 40.

[3. Flow of processing executed by the information providing device]
Next, an example of the flow of processing executed by the information providing device 10 will be described with reference to FIG. 4. FIG. 4 is a flowchart showing an example of the flow of processing executed by the information providing device according to the embodiment.

For example, the information providing device 10 randomly extracts M pieces of data from the undistributed learning data (step S101). It then divides the extracted M pieces of data into K groups and distributes the data of each group to a different arithmetic unit (step S102). The information providing device 10 then determines whether a predetermined learning end condition is satisfied (step S103).

When the information providing device 10 determines that the learning end condition is not satisfied (step S103: No), it determines whether the objective function of each model satisfies a predetermined synchronization condition (step S104). If the condition is satisfied (step S104: Yes), it generates a model in which the models are synchronized with weights corresponding to the improvement in each model's objective function (step S105). The information providing device 10 then distributes the new model to each arithmetic unit (step S106) and executes step S101 again. When it determines that the objective function of each model does not satisfy the predetermined synchronization condition (step S104: No), it likewise executes step S101 again.

When the information providing device 10 determines that the learning end condition is satisfied (step S103: Yes), it generates a learning model in which the models of the arithmetic units are synchronized (step S107) and ends the processing.
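The flow of steps S101 to S107 can be made concrete with a toy simulation. This is a minimal sketch under strong assumptions, none of which come from the embodiment: each model is a single scalar parameter, the objective is a mean squared error against the distributed data, local learning is one gradient step, the end condition is that fewer than M undistributed items remain, and all function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def loss(w: float, batch: Sequence[float]) -> float:
    """Toy objective: mean squared error between w and the data."""
    return sum((w - x) ** 2 for x in batch) / len(batch)


def sgd_step(w: float, batch: Sequence[float], lr: float = 0.25) -> float:
    """One gradient step on the toy objective."""
    grad = sum(2.0 * (w - x) for x in batch) / len(batch)
    return w - lr * grad


def merge(models: Sequence[float], gains: Sequence[float]) -> float:
    """Average the models, weighted by (clipped) objective improvement."""
    weights = [max(g, 0.0) for g in gains]
    total = sum(weights)
    if total == 0.0:
        return sum(models) / len(models)  # assumed fallback: plain average
    return sum(w * x for w, x in zip(weights, models)) / total


def train(data: Sequence[float], k: int, m: int,
          sync_condition: Callable[[float], bool], rng: random.Random) -> float:
    if m < k or len(data) < m:
        raise ValueError("need k <= m <= len(data)")
    pool: List[float] = list(data)
    rng.shuffle(pool)                  # random order gives random extraction
    models = [0.0] * k                 # one model per arithmetic unit
    while True:
        batch, pool = pool[:m], pool[m:]                   # S101: extract M items
        groups = [batch[i::k] for i in range(k)]           # S102: split into K groups
        before = [loss(w, g) for w, g in zip(models, groups)]
        models = [sgd_step(w, g) for w, g in zip(models, groups)]
        gains = [b - loss(w, g) for b, w, g in zip(before, models, groups)]
        if len(pool) < m:                                  # S103: end condition (assumed)
            break
        if any(sync_condition(g) for g in gains):          # S104: sync condition
            merged = merge(models, gains)                  # S105: weighted sync
            models = [merged] * k                          # S106: redistribute
    return merge(models, gains)                            # S107: final learning model


result = train([3.0] * 16, k=2, m=4,
               sync_condition=lambda g: g > 0.0, rng=random.Random(0))
print(result)
```

With all data equal to 3.0, every local step halves the distance to 3.0, so the returned parameter approaches the data value as more batches are distributed.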

[4. Modifications]
An example of processing by the information providing device 10 has been described above. However, the embodiment is not limited to this example. Variations of the processing executed by the information providing device 10 are described below.

[4-1. Device configuration]
The databases 31 and 32 registered in the storage unit 30 may be held in an external storage server, or in various storage devices held individually by the first calculation unit 40 and the second calculation unit 50. The information providing device 10 need not have the second calculation unit 50 inside its housing; for example, the second calculation unit 50 may be housed in an external enclosure. The information providing device 10 may also have a plurality of second calculation units 50 and execute learning processing that uses the arithmetic units of the second calculation units 50 in an integrated manner.

[4-2. Others]
Of the processes described in the above embodiment, all or part of the processes described as being performed automatically can be performed manually, and conversely, all or part of the processes described as being performed manually can be performed automatically by known methods. In addition, the processing procedures, specific names, and information including various data and parameters shown in the above description and drawings can be changed as desired unless otherwise specified. For example, the various information shown in each drawing is not limited to the illustrated information.

The components of each illustrated device are functional concepts and need not be physically configured as illustrated. That is, the specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.

The embodiments described above can also be combined as appropriate to the extent that their processing contents do not contradict one another.

[4-3. Program]
The information providing device 10 according to the embodiment described above is implemented by, for example, a computer 1000 configured as shown in FIG. 5. FIG. 5 is a diagram showing an example of a hardware configuration. The computer 1000 is connected to an output device 1010 and an input device 1020, and has a first arithmetic device 1030, a second arithmetic device 1031, a primary storage device 1040, a secondary storage device 1050, an output IF (Interface) 1060, an input IF 1070, and a network IF 1080 connected by a bus 1090.

The first arithmetic device 1030 operates based on programs stored in the primary storage device 1040 or the secondary storage device 1050, programs read from the input device 1020, and the like, and executes various processes. The primary storage device 1040 is a memory device, such as a RAM, that temporarily stores data used by the first arithmetic device 1030 for various calculations. The secondary storage device 1050 is a storage device in which data used by the first arithmetic device 1030 for various calculations and various databases are registered, and is implemented by a ROM (Read Only Memory), an HDD (Hard Disk Drive), a flash memory, or the like.

The second arithmetic device 1031 has the arithmetic units that perform the model learning described above, that is, a plurality of cores. For example, the second arithmetic device 1031 is implemented by a graphics card or the like on which a GPU is mounted.

The output IF 1060 is an interface for transmitting information to be output to the output device 1010, which outputs various kinds of information and may be, for example, a monitor or a printer. It is implemented by a connector conforming to a standard such as USB (Universal Serial Bus), DVI (Digital Visual Interface), or HDMI (registered trademark) (High Definition Multimedia Interface). The input IF 1070 is an interface for receiving information from various input devices 1020 such as a mouse, a keyboard, and a scanner, and is implemented by, for example, USB.

Note that the input device 1020 may be a device that reads information from, for example, an optical recording medium such as a CD (Compact Disc), a DVD (Digital Versatile Disc), or a PD (Phase change rewritable Disk), a magneto-optical recording medium such as an MO (Magneto-Optical disk), a tape medium, a magnetic recording medium, or a semiconductor memory. The input device 1020 may also be an external storage medium such as a USB memory.

The network IF 1080 receives data from other devices via the network N and sends it to the first arithmetic device 1030, and also transmits data generated by the first arithmetic device 1030 to other devices via the network N.

The first arithmetic device 1030 controls the output device 1010 and the input device 1020 via the output IF 1060 and the input IF 1070. For example, the first arithmetic device 1030 loads a program from the input device 1020 or the secondary storage device 1050 onto the primary storage device 1040 and executes the loaded program.

For example, when the computer 1000 functions as the information providing device 10, the first arithmetic device 1030 of the computer 1000 realizes the functions of the first calculation unit 40 by executing the programs or data loaded on the primary storage device 1040, and the second arithmetic device 1031 operates as the second calculation unit 50 by executing the programs or data loaded on the primary storage device 1040. The first arithmetic device 1030 and the second arithmetic device 1031 of the computer 1000 read these programs or data from the primary storage device 1040 and execute them; as another example, they may acquire these programs from another device via the network N.

[5. Effects]
As described above, the information providing device 10 distributes different learning data to a plurality of arithmetic units, each of which individually trains a model using the learning data distributed to it. The information providing device 10 then synchronizes the models learned by the arithmetic units in a manner that depends on the results of the learning each arithmetic unit executed using the distributed learning data. As a result, the information providing device 10 achieves synchronization that gives more weight to models that have been trained more appropriately, and can therefore improve the learning accuracy of a model trained using a plurality of arithmetic units.

The information providing device 10 also synchronizes the models in a manner that depends on the objective function value of the model trained by each arithmetic unit. For example, the information providing device 10 takes, as the synchronization result, a model obtained by integrating the models with weights corresponding to each model's objective function value. As another example, it takes, as the synchronization result, a model obtained by integrating the models with weights corresponding to the amount of improvement between each model's objective function value before learning and its value after learning. The information providing device 10 may also take, as the synchronization result, the model with the largest improvement between its objective function value before learning and its value after learning. The information providing device 10 may also synchronize the models using a genetic algorithm that selects models according to their objective function values. As a result, among the models learned individually by the arithmetic units, the information providing device 10 can give more weight in synchronization to the models considered to contribute more to accuracy, and can therefore improve the learning accuracy of the model.

The information providing device 10 also distributes different learning data to a plurality of arithmetic units that cause a model to learn the features of the learning data by correcting the model's parameter values using that learning data, and synchronizes the parameter values of the models. The information providing device 10 also newly distributes different learning data to the plurality of arithmetic units, which then train the model synchronized by the synchronization unit using the newly distributed learning data. The information providing device 10 can therefore improve the learning accuracy of various neural networks.

Each time the arithmetic units perform model learning, the information providing device 10 newly distributes different learning data to each arithmetic unit, and it dynamically changes the timing at which the models trained by the plurality of arithmetic units are synchronized.

For example, the information providing device 10 synchronizes the models so that the number of times each arithmetic unit trains its model on newly distributed learning data between synchronizations is random. The information providing device 10 also synchronizes the models when the result of the learning executed by each arithmetic unit satisfies a predetermined condition. The information providing device 10 also synchronizes the models when, for the model trained by at least one of the arithmetic units, the amount of improvement between the objective function value before learning and the value after learning satisfies a predetermined condition. The information providing device 10 can therefore improve the learning accuracy of the model while preventing an increase in the overhead of synchronization processing.

The information providing device 10 also synchronizes the models trained by some of the plurality of arithmetic units. For example, it synchronizes the models trained by those arithmetic units whose communication delay falls within a predetermined range. As another example, it synchronizes the models trained by a randomly selected subset of the arithmetic units. The information providing device 10 can therefore further reduce the overhead of synchronization processing.

The information providing device 10 also synchronizes the models trained by a number of arithmetic units that depends on the number of dimensions of information each arithmetic unit can compute and on the number of pieces of learning data distributed to all the arithmetic units. The information providing device 10 can therefore achieve more efficient model learning.

Some embodiments of the present application have been described above in detail with reference to the drawings, but these are examples. The present invention can be carried out in other forms with various modifications and improvements based on the knowledge of those skilled in the art, including the aspects described in the disclosure of the invention.

The term "unit" (section, module, unit) used above can be read as "means" or "circuit". For example, a detection unit can be read as detection means or a detection circuit.

10 Information providing device
20 Communication unit
30 Storage unit
31 Learning data database
32 Model database
40 First calculation unit
41 Learning control unit
42 Information providing unit
50 Second calculation unit
51 Calculation unit
52 Calculation control unit
521 Distribution unit
522 Synchronization unit
100 Data server
200 User terminal

Claims

An information processing apparatus comprising:
a distribution unit that newly distributes different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization unit that synchronizes the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization unit synchronizes the models using a genetic algorithm that selects a model according to the objective function value of the model trained by each arithmetic unit.

2. The information processing apparatus according to claim 1, wherein the synchronization unit takes, as a synchronization result, a model obtained by integrating the models with weights corresponding to the objective function value of each model.

3. The information processing apparatus according to claim 1 or 2, wherein the synchronization unit takes, as a synchronization result, a model obtained by integrating the models with weights corresponding to the amount of improvement between the objective function value of each model before learning and the objective function value after learning.

4. The synchronizing unit, among the models, sets the model having the largest improvement between the value of the objective function before learning and the value of the objective function after learning as the synchronizing result. The information processing apparatus according to any one of

The information processing apparatus according to any one of claims 1 to 4, wherein the distribution unit distributes different learning data to a plurality of arithmetic units that cause the model to learn features of the learning data by correcting the parameter values of the model using the learning data, and
the synchronization unit synchronizes the parameter values of the models.

6. The information processing apparatus according to any one of claims 1 to 5, wherein the distribution unit newly distributes different learning data to the plurality of arithmetic units, which train the model synchronized by the synchronization unit using the newly distributed learning data.

An information processing apparatus comprising:
a distribution unit that newly distributes different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization unit that synchronizes the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization unit synchronizes the models when a result of the learning executed by each arithmetic unit satisfies a predetermined condition.

The information processing apparatus according to claim 7, wherein the synchronization unit synchronizes the models when, for the model trained by at least one of the arithmetic units, the amount of improvement between the objective function value before learning and the objective function value after learning satisfies a predetermined condition.

An information processing apparatus comprising:
a distribution unit that newly distributes different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization unit that synchronizes the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization unit synchronizes models learned by some of the plurality of arithmetic units, namely models learned by a plurality of arithmetic units whose communication delays fall within a predetermined range.

An information processing apparatus comprising:
a distribution unit that newly distributes different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization unit that synchronizes the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization unit synchronizes models learned by some of the plurality of arithmetic units, namely models learned by a number of arithmetic units that corresponds to the number of dimensions of information each arithmetic unit can compute and to the number of pieces of learning data distributed to all the arithmetic units.

An information processing method executed by an information processing apparatus, the method comprising:
a distribution step of distributing different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization step of synchronizing the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization step synchronizes the models using a genetic algorithm that selects a model according to the objective function value of the model trained by each arithmetic unit.

An information processing program for causing a computer to execute:
a distribution procedure of distributing different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization procedure of synchronizing the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization procedure synchronizes the models using a genetic algorithm that selects a model according to the objective function value of the model trained by each arithmetic unit.

An information processing method executed by an information processing apparatus, the method comprising:
a distribution step of newly distributing different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization step of synchronizing the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization step synchronizes the models when a result of the learning executed by each arithmetic unit satisfies a predetermined condition.

An information processing program for causing a computer to execute:
a distribution procedure of newly distributing different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization procedure of synchronizing the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization procedure synchronizes the models when a result of the learning executed by each arithmetic unit satisfies a predetermined condition.

An information processing method executed by an information processing apparatus, the method comprising:
a distribution step of newly distributing different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization step of synchronizing the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization step synchronizes models learned by some of the plurality of arithmetic units, namely models learned by a plurality of arithmetic units whose communication delays fall within a predetermined range.

An information processing program for causing a computer to execute:
a distribution procedure of newly distributing different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization procedure of synchronizing the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization procedure synchronizes models learned by some of the plurality of arithmetic units, namely models learned by a plurality of arithmetic units whose communication delays fall within a predetermined range.

An information processing method executed by an information processing apparatus, the method comprising:
a distribution step of newly distributing different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization step of synchronizing the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization step synchronizes models learned by some of the plurality of arithmetic units, namely models learned by a number of arithmetic units that corresponds to the number of dimensions of information each arithmetic unit can compute and to the number of pieces of learning data distributed to all the arithmetic units.

An information processing program for causing a computer to execute:
a distribution procedure of newly distributing different learning data, each time a model is learned by each arithmetic unit, to a plurality of arithmetic units that individually learn the model using the distributed learning data; and
a synchronization procedure of synchronizing the models learned by the arithmetic units when any of the learning results produced by the arithmetic units using the distributed learning data satisfies a predetermined condition,
wherein the synchronization procedure synchronizes models learned by some of the plurality of arithmetic units, namely models learned by a number of arithmetic units that corresponds to the number of dimensions of information each arithmetic unit can compute and to the number of pieces of learning data distributed to all the arithmetic units.