JP2017182319A

JP2017182319A - Machine learning device

Info

Publication number: JP2017182319A
Application number: JP2016066356A
Authority: JP
Inventors: Kenta Nishiyuki; Hiroshi Hasegawa; Motoyasu Tanaka; Hironobu Fujiyoshi
Original assignee: MegaChips Corp
Current assignee: MegaChips Corp
Priority date: 2016-03-29
Filing date: 2016-03-29
Publication date: 2017-10-05

Abstract

PROBLEM TO BE SOLVED: To provide a machine learning device that can generate a neural network of an appropriate scale. SOLUTION: An intermediate layer selection unit 12 selects intermediate layers 22 and 23 from among the intermediate layers 22 to 25 contained in a trained neural network 3A. A temporary output layer addition unit 13 generates a neural network 3B by adding, to the neural network 3A, a temporary output layer 32 connected to the intermediate layer 22 and a temporary output layer 33 connected to the intermediate layer 23. A calculation unit 15 performs a computation using the neural network 3B and acquires the temporary output values output from each of the temporary output layers 32 and 33. A deletion determination unit 16 calculates a similarity using the temporary output values acquired by the calculation unit 15 and determines which of the intermediate layers 22 and 23 is to be deleted. A reconfiguration unit 17 deletes that intermediate layer from the neural network 3A. SELECTED DRAWING: Figure 1

Description

The present invention relates to a machine learning device that trains a neural network.

Neural networks are used in object detection devices that detect a predetermined object, such as a person, in an image captured by a camera, and in analysis devices that analyze data measured by sensors. For example, when a neural network is used in an object detection device, a machine learning device trains the neural network on the features of the predetermined object contained in images. The object detection device is then built by implementing the trained neural network on a computer or the like.

The learning accuracy of a neural network tends to improve as its scale increases. However, the amount of computation also grows with the scale. To apply a neural network to an object detection device or a data analysis device, its amount of computation should be as small as possible. A technique for creating neural networks that are both accurate and computationally light is therefore desired.

Non-Patent Document 1 discloses a technique for training a small-scale neural network using the outputs of a trained large-scale neural network. The small-scale neural network is trained by error backpropagation, which is a supervised learning algorithm.

"Distilling the Knowledge in a Neural Network", [online], [retrieved February 9, 2016], Internet <URL: https://www.cs.toronto.edu/~hinton/absps/distillation.pdf>

When a neural network is used in an object detection device, a data analysis device, or the like, its scale must be decided in advance. Specifically, the scale is determined by the number of layers in the network and the number of nodes in each layer. As described above, increasing the scale improves the learning accuracy. The scale is therefore decided based on the learning accuracy required beforehand.

However, to make sure the required learning accuracy is achieved, the scale of the neural network is sometimes given a margin. That is, the scale is set larger than the required learning accuracy calls for. When the scale is excessive, the trained neural network may be performing the same computation repeatedly. In that case, it is desirable to reduce the amount of computation by shrinking the trained neural network to a scale appropriate for the required learning accuracy. However, no technique for reducing the scale of a trained neural network has been developed.

An object of the present invention is to provide a machine learning device capable of creating a neural network of an appropriate scale.

To solve the above problem, the invention according to claim 1 is a machine learning device comprising: an acquisition unit that acquires a trained neural network including at least two intermediate layers; an intermediate layer selection unit that selects a first intermediate layer and a second intermediate layer from among the at least two intermediate layers; a temporary output layer addition unit that generates a temporary network by adding, to the neural network, a first temporary output layer having temporary output nodes connected to the nodes of the first intermediate layer and a second temporary output layer having temporary output nodes connected to the nodes of the second intermediate layer; a calculation unit that inputs test data to the temporary network, executes a computation using the temporary network, and acquires a first temporary output value output from the temporary output nodes of the first temporary output layer and a second temporary output value output from the temporary output nodes of the second temporary output layer; a deletion determination unit that uses the first temporary output value and the second temporary output value to calculate a similarity indicating the degree to which the function of the first intermediate layer resembles the function of the second intermediate layer, and determines, based on the calculated similarity, whether to delete the first intermediate layer; and a reconfiguration unit that deletes the first intermediate layer from the neural network when the deletion determination unit determines that the first intermediate layer is to be deleted.

The invention according to claim 2 is the machine learning device according to claim 1, wherein the first intermediate layer is located closer to the input layer of the neural network than the second intermediate layer.

The invention according to claim 3 is the machine learning device according to claim 1 or 2, wherein the number of temporary output nodes in the first temporary output layer is the same as the number of temporary output nodes in the second temporary output layer.

The invention according to claim 4 is the machine learning device according to claim 3, wherein the number of temporary output nodes in the first temporary output layer is the same as the number of output nodes in the output layer of the neural network.

The invention according to claim 5 is the machine learning device according to any one of claims 1 to 4, wherein, when the nodes of the second intermediate layer are connected to the nodes of the output layer of the neural network, the temporary output layer addition unit decides to use the output layer as the second temporary output layer.

The invention according to claim 6 is the machine learning device according to any one of claims 1 to 5, further comprising an additional learning unit that performs learning to update the weighting coefficients set on the signal paths between the nodes of the first intermediate layer and the nodes of the first temporary output layer, and the weighting coefficients set on the signal paths between the nodes of the second intermediate layer and the nodes of the second temporary output layer.

The invention according to claim 7 is the machine learning device according to any one of claims 1 to 6, wherein, when the first intermediate layer is the k-th layer counted from the input layer of the neural network (k being a natural number from 2 to m−1), the reconfiguration unit newly connects the nodes of the (k−1)-th layer to the nodes of the (k+1)-th layer and performs learning to update the weighting coefficients set on the signal paths between the nodes of the (k−1)-th layer and the nodes of the (k+1)-th layer.

The invention according to claim 8 is a machine learning method comprising the steps of: acquiring a trained neural network including at least two intermediate layers; selecting a first intermediate layer and a second intermediate layer from among the at least two intermediate layers; generating a temporary network by adding, to the neural network, a first temporary output layer having temporary output nodes connected to the nodes of the first intermediate layer and a second temporary output layer having temporary output nodes connected to the nodes of the second intermediate layer; inputting test data to the temporary network, executing a computation using the temporary network, and acquiring a first temporary output value output from the temporary output nodes of the first temporary output layer and a second temporary output value output from the temporary output nodes of the second temporary output layer; using the first temporary output value and the second temporary output value to calculate a similarity indicating the degree to which the function of the first intermediate layer resembles the function of the second intermediate layer, and determining, based on the calculated similarity, whether to delete the first intermediate layer; and deleting the first intermediate layer from the neural network when it is determined that the first intermediate layer is to be deleted.

The machine learning device according to the present invention acquires a trained neural network. The intermediate layer selection unit selects two intermediate layers included in the acquired neural network. The temporary output layer addition unit generates a temporary network by adding to the neural network a temporary output layer connected to each of the two selected intermediate layers. A computation using the temporary network causes temporary output values to be output from the temporary output layers. The deletion determination unit calculates the similarity of the two intermediate layers from the temporary output values and determines, based on the calculated similarity, whether one of the two intermediate layers can be deleted. When it is determined that one of the intermediate layers can be deleted, the reconfiguration unit deletes that intermediate layer from the neural network. The machine learning device can thereby generate a neural network of a scale appropriate for the required learning accuracy.

A functional block diagram showing the configuration of a machine learning device according to an embodiment of the present invention.
A diagram showing an example of the configuration of a neural network input to the machine learning device shown in FIG. 1.
A flowchart showing the operation of the machine learning device shown in FIG. 1.
A diagram showing the temporary output layers connected to the two intermediate layers selected by the intermediate layer selection unit shown in FIG. 1.
A diagram showing a network obtained by deleting an intermediate layer from the neural network shown in FIG. 2.
A diagram showing the temporary output layers added to the neural network shown in FIG. 5.
A diagram showing a network obtained by deleting an intermediate layer from the neural network shown in FIG. 5.
A functional block diagram showing the configuration of a modification of the machine learning device shown in FIG. 1.

Hereinafter, embodiments of the present invention are described in detail with reference to the drawings. Identical or corresponding parts in the drawings are given the same reference numerals, and their description is not repeated.

{1. Outline of Machine Learning Device 100}
FIG. 1 is a functional block diagram showing the configuration of a machine learning device 100 according to an embodiment of the present invention. As shown in FIG. 1, the machine learning device 100 receives a trained neural network 3A as input. The machine learning device 100 identifies, among the intermediate layers of the neural network 3A, an intermediate layer that can be deleted, and outputs a neural network obtained by deleting the identified intermediate layer from the neural network 3A.

FIG. 2 is a diagram showing an example of the configuration of the neural network 3A input to the machine learning device 100. As shown in FIG. 2, the neural network 3A is a six-layer perceptron comprising an input layer 21, intermediate layers 22 to 25, and an output layer 26. The input layer 21 has input nodes 21a to 21d. The intermediate layer 22 has nodes 22a to 22d. The intermediate layer 23 has nodes 23a to 23d. The intermediate layer 24 has nodes 24a to 24d. The intermediate layer 25 has nodes 25a to 25d. The output layer 26 has output nodes 26a to 26c, which output normal output values 46a to 46c. Hereinafter, the normal output values 46a to 46c are referred to collectively as the normal output value 46.
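As a rough illustration of the scale of this example network, the sketch below counts the signal paths (weights) between adjacent layers. It assumes that every node is connected to every node of the next layer and ignores bias terms, neither of which is stated in the text; the names `LAYER_SIZES` and `count_signal_paths` are hypothetical.

```python
# Nodes per layer in the example of FIG. 2:
# input layer 21, intermediate layers 22 to 25, output layer 26.
LAYER_SIZES = [4, 4, 4, 4, 4, 3]


def count_signal_paths(sizes):
    """Count weights per pair of adjacent layers, assuming full connectivity."""
    return [n_in * n_out for n_in, n_out in zip(sizes, sizes[1:])]


paths = count_signal_paths(LAYER_SIZES)
total_paths = sum(paths)
print(paths, total_paths)
```

Under these assumptions, deleting one four-node intermediate layer removes one 4 x 4 block of weights from the total.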

The neural network 3A has already been trained, as described above, and is used, for example, to detect multiple types of objects in an image captured by a camera. The multiple types of objects are, for example, dogs, cats, and other objects. The learning algorithm of the neural network 3A is not particularly limited. Transfer learning may be used in training the neural network 3A.

When the neural network 3A shown in FIG. 2 is input, the machine learning device 100 selects two intermediate layers from the intermediate layers 22 to 25. Based on the similarity between the two selected intermediate layers, the machine learning device 100 determines whether one of them can be deleted. The method of calculating the similarity is described later.

For example, the machine learning device 100 selects the intermediate layers 22 and 23 in the neural network 3A and calculates their similarity. Based on the calculated similarity, the machine learning device 100 decides to delete the intermediate layer 22 of the two. The machine learning device 100 generates, as a neural network 3D, the network obtained by deleting the intermediate layer 22 from the neural network 3A. The neural network 3D has learning accuracy equivalent to that of the neural network 3A and is smaller in scale. In this way, the machine learning device 100 can create a neural network 3D of a scale appropriate for the required learning accuracy.

{2. Configuration of Machine Learning Device 100}
As shown in FIG. 1, the machine learning device 100 includes a network acquisition unit 11, an intermediate layer selection unit 12, a temporary output layer addition unit 13, an additional learning unit 14, a calculation unit 15, a deletion determination unit 16, a reconfiguration unit 17, and an end determination unit 18.

The network acquisition unit 11 acquires the trained neural network 3A. The network acquisition unit 11 may receive the neural network 3A from another computer via a LAN (Local Area Network) or may read it from a nonvolatile recording medium such as a Blu-ray disc. Alternatively, the network acquisition unit 11 may generate the trained neural network 3A itself.

The intermediate layer selection unit 12 selects two intermediate layers from among the intermediate layers 22 to 25 included in the neural network 3A acquired by the network acquisition unit 11.

The temporary output layer addition unit 13 adds to the neural network 3A a temporary output layer connected to each of the two selected intermediate layers, and outputs the resulting network as a neural network 3B. For example, when the intermediate layer selection unit 12 selects the intermediate layers 22 and 23, the temporary output layer addition unit 13 adds to the neural network 3A one temporary output layer whose nodes are connected to the nodes of the intermediate layer 22 and another temporary output layer whose nodes are connected to the nodes of the intermediate layer 23.
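The addition of temporary output layers can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout, the function name `add_temporary_output_layers`, and the small random initial weights are hypothetical, and each temporary output layer is given as many nodes as the output layer, as in claim 4. The original network is copied so that 3A itself is left unchanged.

```python
import copy
import random


def add_temporary_output_layers(network, selected, seed=0):
    """Return a copy of `network` with one temporary output layer per selected layer.

    network["sizes"] lists nodes per layer, input layer first (index 0).
    `selected` holds the indices of the two chosen intermediate layers.
    """
    rng = random.Random(seed)
    net_b = copy.deepcopy(network)
    n_out = network["sizes"][-1]  # same node count as the output layer (claim 4)
    net_b["temp_heads"] = {}
    for layer_index in selected:
        n_in = network["sizes"][layer_index]
        # Assumed initialization: small random weights, to be trained afterwards.
        net_b["temp_heads"][layer_index] = [
            [rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)
        ]
    return net_b


# Index 1 stands for intermediate layer 22 and index 2 for intermediate layer 23.
net_3a = {"sizes": [4, 4, 4, 4, 4, 3]}
net_3b = add_temporary_output_layers(net_3a, selected=[1, 2])
```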

The additional learning unit 14 performs additional learning on the neural network 3B using learning data 4. The learning data 4 is, for example, images each showing one of a dog, a cat, or another object. Among the signal paths in the neural network 3B, the additional learning unit 14 updates the weighting coefficients of the signal paths newly created by adding the temporary output layers. The weighting coefficients are updated by error backpropagation. The additional learning unit 14 outputs the network with the updated weighting coefficients as a neural network 3C.
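A minimal sketch of this kind of additional learning is shown below. It trains only the weights of one temporary output layer by gradient descent while the activations of the intermediate layer are treated as fixed inputs. The softmax cross-entropy loss, the learning rate, the absence of bias terms, and the toy activations are all assumptions made for illustration; the text states only that the new weighting coefficients are updated by error backpropagation.

```python
import math


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def head_scores(head, activation):
    return [sum(w * a for w, a in zip(row, activation)) for row in head]


def head_loss(head, samples):
    """Mean cross-entropy of the temporary output layer over the samples."""
    losses = [-math.log(softmax(head_scores(head, h))[label]) for h, label in samples]
    return sum(losses) / len(losses)


def train_head(head, samples, lr=0.5, epochs=200):
    """Update only the head weights; the intermediate activations stay fixed."""
    for _ in range(epochs):
        for activation, label in samples:
            probs = softmax(head_scores(head, activation))
            for j, row in enumerate(head):
                error = probs[j] - (1.0 if j == label else 0.0)
                for i, a in enumerate(activation):
                    row[i] -= lr * error * a
    return head


# Toy stand-ins for the outputs of a four-node intermediate layer, with class labels.
samples = [([1.0, 0.0, 0.0, 0.5], 0), ([0.0, 1.0, 0.0, 0.5], 1), ([0.0, 0.0, 1.0, 0.5], 2)]
head = [[0.0] * 4 for _ in range(3)]  # three temporary output nodes
loss_before = head_loss(head, samples)
train_head(head, samples)
loss_after = head_loss(head, samples)
```

Because the weights of the original network never appear in the update, they stay exactly as they were in 3A.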

The calculation unit 15 inputs test data 5 to the neural network 3C and executes a computation using the neural network 3C. Like the learning data 4, the test data 5 is images each showing one of a dog, a cat, or another object. As the computation result, the neural network 3C outputs the normal output value 46 from the output nodes 26a to 26c of the output layer 26. The neural network 3C also outputs a temporary output value from each of the two temporary output layers.
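The computation described here, one forward pass that yields both the normal output and the two temporary outputs, might look like the following sketch. The tanh activation, the random placeholder weights, and the names `run_temporary_network` and `heads` are assumptions; the text does not specify the activation function or the data layout.

```python
import math
import random


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(n_out, n_in, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def run_temporary_network(weights, heads, x):
    """Return (normal_output, temp_outputs) for one input vector.

    weights[i] maps layer i to layer i + 1 (layer 0 is the input layer);
    heads maps a layer index to the weights of its temporary output layer.
    """
    temp_outputs = {}
    activation = x
    for index, matrix in enumerate(weights, start=1):
        activation = [math.tanh(v) for v in matvec(matrix, activation)]
        if index in heads:
            # Branch off: feed this layer's activation to its temporary output layer.
            temp_outputs[index] = matvec(heads[index], activation)
    return activation, temp_outputs


rng = random.Random(0)
sizes = [4, 4, 4, 4, 4, 3]
weights = [random_matrix(n_out, n_in, rng) for n_in, n_out in zip(sizes, sizes[1:])]
heads = {1: random_matrix(3, 4, rng), 2: random_matrix(3, 4, rng)}
normal_output, temp_outputs = run_temporary_network(weights, heads, [0.2, 0.4, 0.6, 0.8])
```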

The deletion determination unit 16 calculates, from the temporary output values, the similarity between the two intermediate layers selected by the intermediate layer selection unit 12. The method of calculating the similarity is described later. Based on the calculated similarity, the deletion determination unit 16 determines whether one of the two selected intermediate layers can be deleted.

When the deletion determination unit 16 determines that one of the two intermediate layers can be deleted, the reconfiguration unit 17 generates a neural network 3D by deleting that intermediate layer from the neural network 3A. The reconfiguration unit 17 then retrains the generated neural network 3D, using the same algorithm that was used to train the neural network 3A.
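Deleting an intermediate layer and reconnecting its neighbours can be sketched as below. The function name `delete_layer`, the list-of-matrices layout, and the random initialization of the new connection are assumptions for illustration; the retraining that follows is not shown.

```python
import random


def delete_layer(sizes, weights, k, seed=0):
    """Remove the k-th layer (1-based, counted from the input layer).

    weights[i] connects layer i + 1 to layer i + 2 in 1-based numbering.
    The layers on either side of the deleted one are joined by a new weight
    matrix, which still has to be trained afterwards.
    """
    if not 2 <= k <= len(sizes) - 1:
        raise ValueError("only intermediate layers can be deleted")
    rng = random.Random(seed)
    n_in, n_out = sizes[k - 2], sizes[k]  # nodes of layers k-1 and k+1
    bridge = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    new_sizes = sizes[:k - 1] + sizes[k:]
    new_weights = weights[:k - 2] + [bridge] + weights[k:]
    return new_sizes, new_weights


sizes = [4, 4, 4, 4, 4, 3]
weights = [[[0.0] * n_in for _ in range(n_out)] for n_in, n_out in zip(sizes, sizes[1:])]
# k = 2 corresponds to intermediate layer 22 in the example of FIG. 2.
new_sizes, new_weights = delete_layer(sizes, weights, k=2)
```

The weight matrices on the far side of the deleted layer are reused unchanged, so only the new connection starts from untrained values.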

The end determination unit 18 determines whether to delete further intermediate layers from the trained neural network 3D. Specifically, the end determination unit 18 determines the number of intermediate layers deleted from the neural network 3A by subtracting the number of intermediate layers in the neural network 3D from the number of intermediate layers in the neural network 3A.

When the number of deleted intermediate layers has not reached a preset end reference value, the end determination unit 18 determines that further intermediate layers can be deleted from the neural network 3D. The end determination unit 18 outputs the neural network 3D to the intermediate layer selection unit 12 and the temporary output layer addition unit 13. The machine learning device 100 thereby repeats the process of determining whether an intermediate layer can be deleted from the neural network 3D.

When the number of deleted intermediate layers has reached the end reference value, the end determination unit 18 determines that no further intermediate layers can be deleted from the neural network 3D. In this case, the end determination unit 18 outputs the neural network 3D to the outside of the machine learning device 100. The neural network 3D output from the machine learning device 100 is used in an object detection device (not shown) that distinguishes dogs, cats, and other objects.
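The end determination described above reduces to a simple count comparison, sketched here. The function names and the example end reference value of 2 are hypothetical.

```python
def count_intermediate_layers(sizes):
    """All layers except the input layer and the output layer."""
    return len(sizes) - 2


def can_delete_more(sizes_3a, sizes_3d, end_reference):
    """True while fewer layers than the end reference value have been deleted."""
    deleted = count_intermediate_layers(sizes_3a) - count_intermediate_layers(sizes_3d)
    return deleted < end_reference


original = [4, 4, 4, 4, 4, 3]
after_one_deletion = [4, 4, 4, 4, 3]
after_two_deletions = [4, 4, 4, 3]
continue_after_one = can_delete_more(original, after_one_deletion, end_reference=2)
continue_after_two = can_delete_more(original, after_two_deletions, end_reference=2)
```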

{3. Operation of Machine Learning Device 100}
The operation of the machine learning device 100 is described in detail below, taking as an example the case where the neural network 3A shown in FIG. 2 is input to the machine learning device 100. FIG. 3 is a flowchart showing the operation of the machine learning device 100.

{3.1. Deletion of Intermediate Layer 22}
First, the operation of the machine learning device 100 is described, taking as an example the case where the intermediate layer 22 is deleted from the neural network 3A shown in FIG. 2.

{3.1.1. Acquisition of Neural Network 3A}
As shown in FIG. 3, the network acquisition unit 11 acquires the neural network 3A input to the machine learning device 100 (step S1).

The neural network 3A is data that includes data defining the nodes in each layer and the weighting coefficients of the signal paths connecting pairs of nodes. When two layers are adjacent, a signal path connects one node of one layer to a node of the other layer. Because the neural network 3A has been trained, its weighting coefficients have already been adjusted to distinguish dogs, cats, and other objects.

The neural network 3A shown in FIG. 2 is a six-layer perceptron, but its configuration is not limited to this. The neural network 3A only needs to include at least two intermediate layers; that is, it only needs to have four or more layers. The number of nodes in each layer of the neural network 3A is not particularly limited.

For the neural network 3A shown in FIG. 2, "sections" of the signal paths on which weighting coefficients are set are defined. Here, a "section" is a name of convenience for indicating the position of signal paths in the neural network 3A. Specifically, section A1 runs from the input layer 21 to the intermediate layer 22. Section B1 runs from the intermediate layer 22 to the intermediate layer 23. Section C1 runs from the intermediate layer 23 to the intermediate layer 24. Section D1 runs from the intermediate layer 24 to the intermediate layer 25. Section E1 runs from the intermediate layer 25 to the output layer 26. Next, directions in the neural network 3A are defined. Taking any intermediate layer of the neural network 3A as a reference, the direction in which the input layer 21 lies is defined as "downward" and the direction in which the output layer 26 lies is defined as "upward".
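The section names and directions defined above can be written down as a small lookup, shown here purely as an illustration; `SECTIONS` and `direction` are hypothetical names.

```python
LAYERS = [21, 22, 23, 24, 25, 26]  # reference numerals, input side first
SECTION_NAMES = ["A1", "B1", "C1", "D1", "E1"]

# Each section spans one pair of adjacent layers.
SECTIONS = dict(zip(SECTION_NAMES, zip(LAYERS, LAYERS[1:])))


def direction(reference_layer, other_layer):
    """Direction of other_layer as seen from reference_layer."""
    ref = LAYERS.index(reference_layer)
    other = LAYERS.index(other_layer)
    if other == ref:
        raise ValueError("the two layers are the same")
    return "downward" if other < ref else "upward"
```

For instance, `SECTIONS["B1"]` gives the pair of layers 22 and 23, and layer 21 lies downward of layer 23.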

{3.1.2. Selection of Intermediate Layers}
The intermediate layer selection unit 12 selects two intermediate layers from among the intermediate layers 22 to 25 of the neural network 3A (step S2). For example, the intermediate layer selection unit 12 selects the intermediate layers 22 and 23. It is desirable for the intermediate layer selection unit 12 to select two intermediate layers that are adjacent to each other, for the following reason. As described above, when the machine learning device 100 is able to identify two intermediate layers with similar functions among the intermediate layers 22 to 25 of the neural network 3A, it deletes one of those two intermediate layers from the neural network 3A.

In the neural network 3A, the closer two intermediate layers are to each other, the more similar their functions are considered to be. The function of each intermediate layer depends on what the neural network 3A is used for, but an intermediate layer near the input layer 21 can be understood as classifying abstract concepts, and an intermediate layer near the output layer 26 as classifying specific concepts.

For example, consider a case where the neural network 3A detects a dog, a cat, or some other object in a captured image. In this case, an intermediate layer close to the input layer 21 can be interpreted as determining whether the object in the image is an animal, and an intermediate layer close to the output layer 26 as determining whether the animal in the image is a dog or a cat.

Accordingly, two adjacent intermediate layers in the neural network 3A are expected to have similar functions. For this reason, the intermediate layer selection unit 12 preferably selects two adjacent intermediate layers. However, the two selected intermediate layers need not be adjacent to each other.
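As a small illustration of the preference for adjacent layers, the candidate pairs can be enumerated as below. This is only a sketch: the function name `adjacent_pairs` and the use of reference numerals as layer identifiers are assumptions made for this example, not part of the described device.

```python
def adjacent_pairs(layer_ids):
    """Return pairs of neighbouring intermediate layers, input side first.

    layer_ids is ordered from the input layer side ("downward")
    to the output layer side ("upward").
    """
    return list(zip(layer_ids, layer_ids[1:]))


# Intermediate layers 22 to 25 of the neural network 3A
candidates = adjacent_pairs([22, 23, 24, 25])
print(candidates)
```

Each pair lists the downward layer first, which is the one that would be deleted if the pair is later judged similar.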

{3.1.3. Addition of temporary output layers}
The intermediate layer selection unit 12 outputs intermediate layer selection data, which identifies the two intermediate layers selected in step S2, to the temporary output layer adding unit 13. Based on the intermediate layer selection data, the temporary output layer adding unit 13 adds two temporary output layers to the neural network 3A (step S3).

FIG. 4 is a diagram showing a neural network 3B having temporary output layers. The neural network 3B shown in FIG. 4 is generated by adding temporary output layers 32 and 33 to the neural network 3A shown in FIG. 2. In FIG. 4, the intermediate layers 24 and 25 and the output layer 26 are not shown.

The addition of the temporary output layers (step S3) is described in detail below, taking as an example the case where the intermediate layers 22 and 23 are selected in step S2 (see FIG. 3). First, the temporary output layer adding unit 13 identifies the intermediate layer 23, which is the upward one of the selected intermediate layers 22 and 23. The temporary output layer adding unit 13 adds a temporary output layer 33 corresponding to the identified intermediate layer 23 to the neural network 3A.

Specifically, the temporary output layer adding unit 13 generates a temporary output layer 33 having temporary output nodes 33a to 33c. The temporary output layer adding unit 13 adds the temporary output layer 33 to the neural network 3A by connecting each of the temporary output nodes 33a to 33c to the nodes 23a to 23d of the intermediate layer 23. The number of temporary output nodes in the temporary output layer 33 is set equal to the number of output nodes in the output layer 26. The reason for this is described later.

When adding the temporary output layer 33, the temporary output layer adding unit 13 sets the initial value of the weight coefficient of each signal path in the section G1 between the intermediate layer 23 and the temporary output layer 33. The initial values of the weight coefficients are set randomly, for example, so as to fall within a predetermined variance range.

Next, the temporary output layer adding unit 13 identifies the intermediate layer 22, which is the downward one of the selected intermediate layers 22 and 23. The temporary output layer adding unit 13 adds a temporary output layer 32 corresponding to the identified intermediate layer 22 to the neural network 3A.

Specifically, the temporary output layer adding unit 13 generates a temporary output layer 32 having temporary output nodes 32a to 32c. The temporary output layer adding unit 13 adds the temporary output layer 32 to the neural network 3A by connecting each of the temporary output nodes 32a to 32c to the nodes 22a to 22d of the intermediate layer 22. The number of temporary output nodes in the temporary output layer 32 matches the number of output nodes in the output layer 26.

When adding the temporary output layer 32, the temporary output layer adding unit 13 sets the initial value of the weight coefficient of each signal path in the section F1 between the intermediate layer 22 and the temporary output layer 32. The initial values of the weight coefficients are set randomly so as to fall within a predetermined variance range.

Each of the temporary output nodes 32a to 32c of the temporary output layer 32 corresponds to one of the output nodes 26a to 26c. The same applies to the temporary output nodes 33a to 33c of the temporary output layer 33. Specifically, the temporary output nodes 32a and 33a correspond to the output node 26a. The temporary output nodes 32b and 33b correspond to the output node 26b. The temporary output nodes 32c and 33c correspond to the output node 26c.

In this way, the temporary output layers 32 and 33 are added to the neural network 3A. The temporary output layer adding unit 13 outputs the network obtained by adding the temporary output layers 32 and 33 to the neural network 3A as the neural network 3B.
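The addition of a temporary output layer can be sketched as creating one randomly initialised weight matrix per selected intermediate layer. The sketch below is illustrative only: the function name, the zero-mean Gaussian initialisation, and the standard deviation `std=0.1` are assumptions, since the text only states that the initial values are random within a predetermined variance range.

```python
import random


def make_temp_output_weights(n_hidden, n_outputs, std=0.1, seed=None):
    """Create weights for one temporary output layer.

    Returns a list of n_outputs rows, one per temporary output node,
    each holding n_hidden weights (one per node of the intermediate layer).
    Gaussian initialisation with a fixed std is an assumption.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(n_hidden)]
            for _ in range(n_outputs)]


# Intermediate layers 22 and 23 each have 4 nodes; output layer 26 has 3 nodes.
section_f1 = make_temp_output_weights(n_hidden=4, n_outputs=3, seed=0)  # layer 22 -> layer 32
section_g1 = make_temp_output_weights(n_hidden=4, n_outputs=3, seed=1)  # layer 23 -> layer 33
```

Both matrices have the same number of rows because each temporary output layer has as many nodes as the output layer 26.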

{3.1.4. Additional learning}
Refer again to FIG. 3. The additional learning unit 14 receives the neural network 3B output from the temporary output layer adding unit 13. The additional learning unit 14 performs additional learning of the neural network 3B (step S4).

The additional learning (step S4) is described in detail with reference to FIG. 4. The additional learning unit 14 updates only the weight coefficients of the signal paths in the sections F1 and G1. The weight coefficients of the signal paths in the sections A1, B1, C1, D1, and E1 are not updated. In other words, the additional learning (step S4) is a process of updating the weight coefficients of the signal paths in the sections between each intermediate layer selected in step S2 and the temporary output layer connected to it.

The additional learning is performed using the error backpropagation method. Specifically, the additional learning unit 14 inputs learning data 4 to the neural network 3B and performs computation using the neural network 3B. The learning data 4 is, for example, image data in which a dog, a cat, or some other object has been captured.

By computing the neural network 3B with the learning data 4 as input, the additional learning unit 14 obtains temporary output values 42a to 42c and 43a to 43c.

The temporary output values 42a to 42c are the values output from the temporary output nodes 32a to 32c by the computation of the neural network 3B when the learning data 4, or the test data 5 described later, is input to the neural network 3B. Hereinafter, the temporary output values 42a to 42c are collectively referred to as the temporary output value 42. The temporary output values 43a to 43c are the values output from the temporary output nodes 33a to 33c by the computation of the neural network 3B. Hereinafter, the temporary output values 43a to 43c are collectively referred to as the temporary output value 43. The temporary output values 42 and 43 obtained by inputting the learning data 4 to the neural network 3B are referred to as the temporary output values for additional learning.

The additional learning unit 14 calculates an error value for each of the temporary output nodes 32a to 32c by subtracting the temporary output value 42 for additional learning from the correct value. The correct value (teacher signal) is a value determined in advance as the value that each of the output nodes 26a to 26c should output when the learning data 4 is input to the neural network 3B.

For example, the error value for the temporary output node 32a is obtained by subtracting the temporary output value 42a for additional learning from the correct value of the output node 26a. The additional learning unit 14 inputs the error value for each of the temporary output nodes 32a to 32c to that node, and updates the weight coefficients of the signal paths in the section F1 based on the input error values.

Similarly, the additional learning unit 14 calculates an error value for each of the temporary output nodes 33a to 33c by subtracting the temporary output value 43 for additional learning from the correct value. The additional learning unit 14 inputs the error value for each of the temporary output nodes 33a to 33c to that node, and updates the weight coefficients of the signal paths in the section G1 based on the input error values.

The reason why only the weight coefficients of the signal paths in the sections F1 and G1 are updated in the additional learning is as follows. When performing the additional learning (step S4), the additional learning unit 14 could also update the weight coefficients of the signal paths in the section A1 based on the error values input to the temporary output nodes 32a to 32c. Likewise, it could update the weight coefficients of the signal paths in each of the sections A1 and B1 based on the error values input to the temporary output nodes 33a to 33c.

However, because the neural network 3A has already been trained, the weight coefficients of the signal paths in the sections A1 and B1 have already been adjusted. If these weight coefficients were updated in the additional learning (step S4), the learning accuracy of the neural network 3A could decrease. For this reason, the additional learning unit 14 updates only the weight coefficients of the signal paths in the sections F1 and G1, which were created by adding the temporary output layers 32 and 33. The neural network 3B for which the additional learning has been completed is output to the calculation unit 15 as the neural network 3C.
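The idea of updating only the new sections can be sketched as follows. This is a minimal illustration under stated assumptions, not the described implementation: the temporary output nodes are assumed to be linear, the update is a plain delta rule with a hypothetical learning rate `lr`, and the activations of the intermediate layer are passed in as fixed numbers because the already trained sections (A1, B1, ...) are left untouched.

```python
def train_head_step(head_w, hidden, target, lr=0.1):
    """One update of a temporary output layer's weights, in place.

    head_w : rows of weights, one row per temporary output node
    hidden : activations of the (frozen) intermediate layer
    target : correct values for the corresponding output nodes
    Returns the error values (correct value minus temporary output value).
    """
    outputs = [sum(w * h for w, h in zip(row, hidden)) for row in head_w]
    errors = [t - o for t, o in zip(target, outputs)]
    for row, err in zip(head_w, errors):
        for j, h in enumerate(hidden):
            row[j] += lr * err * h  # only the new section is modified
    return errors


head = [[0.0] * 4 for _ in range(3)]   # e.g. section F1, 4 nodes -> 3 nodes
hidden = [1.0, 0.5, 0.0, 0.2]          # hypothetical activations of layer 22
target = [1.0, 0.0, 0.0]               # hypothetical teacher signal
for _ in range(100):
    errors = train_head_step(head, hidden, target)
```

Because `hidden` is treated as a constant input, no gradient reaches the earlier sections, which mirrors the decision to leave the trained weight coefficients unchanged.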

{3.1.5. Calculation of similarity}
The calculation unit 15 receives the neural network 3C output from the additional learning unit 14. The calculation unit 15 inputs test data 5 to the neural network 3C and performs computation using the neural network 3C (step S5). In the present embodiment, the test data 5 is image data in which a dog, a cat, or some other object has been captured. As shown in FIG. 4, as the result of the computation using the test data 5, the neural network 3C outputs the temporary output value 42 from the temporary output nodes 32a to 32c and the temporary output value 43 from the temporary output nodes 33a to 33c.

The deletion determination unit 16 acquires the temporary output values 42 and 43 from the calculation unit 15. Using the acquired temporary output values 42 and 43, the deletion determination unit 16 calculates the similarity between the intermediate layer 22 and the intermediate layer 23 (step S6). The similarity is obtained by calculating the Bhattacharyya distance, a parameter that indicates how similar two probability density distributions are. Specifically, the similarity is calculated using formula (1).

In formula (1), S is the similarity. p(i) is the probability density distribution of the temporary output values output from the temporary output nodes of one of the two temporary output layers. q(i) is the probability density distribution of the temporary output values output from the temporary output nodes of the other temporary output layer. i denotes the i-th node in each temporary output layer.
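Formula (1) itself is not reproduced in this text. Given the definitions of S, p(i), and q(i) above and the statement that S lies between 0 and 1 and approaches 1 as the two distributions become more alike, a form consistent with the description is the Bhattacharyya coefficient shown below. This is an assumption, not a quotation of the original formula.

```latex
S = \sum_{i} \sqrt{p(i)\, q(i)}
```

Under this assumption, S = 1 when p and q are identical, and the Bhattacharyya distance in the usual sense is $-\ln S$.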

In the example shown in FIG. 4, p(i) can be assigned to the probability density distribution of the temporary output value 42, and q(i) to the probability density distribution of the temporary output value 43. The deletion determination unit 16 normalizes each of the temporary output values 42a to 42c to a value between 0 and 1 inclusive, and creates the probability density distribution p(i) from the normalized temporary output values 42a to 42c. The probability density distribution q(i) of the temporary output value 43 is created in the same way.

The similarity calculated by formula (1) is a value between 0 and 1 inclusive. The similarity approaches 1 as the difference between the temporary output value 42 and the temporary output value 43 decreases. One similarity is calculated for each item of test data 5. By repeating steps S5 and S6, the machine learning device 100 calculates the similarity corresponding to each of a plurality of items of test data 5.
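The similarity computation for one item of test data can be sketched as below. Two points are assumptions: the temporary output values are turned into a distribution with a softmax (the text only says they are normalized to values between 0 and 1), and the similarity is taken to be the Bhattacharyya coefficient, which matches the stated range and behaviour. The function names and sample values are hypothetical.

```python
import math


def to_distribution(values):
    """Normalise temporary output values so they are in [0, 1] and sum to 1.

    Softmax is an assumed normalisation; the source does not name one.
    """
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def similarity(p, q):
    """Bhattacharyya coefficient of two discrete distributions."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


p = to_distribution([2.0, 0.5, -1.0])   # hypothetical temporary output values 42a-42c
q = to_distribution([0.1, 0.2, 3.0])    # hypothetical temporary output values 43a-43c
s = similarity(p, q)
```

Calling `similarity(p, p)` gives 1 up to rounding, which corresponds to two temporary output layers producing identical distributions.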

Note that in step S5 the neural network 3C also outputs a normal output value 46 as the result of the computation on the test data 5. However, when the intermediate layers 22 and 23 have been selected in step S2, the normal output value 46 is not used to calculate the similarity. An example in which the normal output value 46 is used to calculate the similarity is described later.

{3.1.6. Deletion determination}
As described above, the calculation unit 15 inputs a predetermined number of items of test data 5 to the neural network 3C and computes the neural network 3C, and the deletion determination unit 16 thereby calculates the same predetermined number of similarities (steps S5 and S6).

When the computation of the neural network 3C using the predetermined number of items of test data 5 is complete, the deletion determination unit 16 calculates the average of the predetermined number of similarities. The deletion determination unit 16 compares the calculated average with a preset similarity reference value to determine whether one of the intermediate layers 22 and 23 can be deleted (step S7).

As described above, the similarity approaches 1 as the difference between the temporary output value 42 and the temporary output value 43 decreases. The closer the average similarity is to 1, the more similar the functions of the intermediate layers 22 and 23 in the neural network 3A are judged to be.

When the average similarity is larger than the similarity reference value, the deletion determination unit 16 determines that the function of the intermediate layer 22 is similar to the function of the intermediate layer 23. In this case, the deletion determination unit 16 determines that one of the intermediate layers 22 and 23 can be deleted (Yes in step S7). The deletion determination unit 16 decides to delete the intermediate layer 22, which is the downward one of the intermediate layers 22 and 23. The intermediate layer 22 is farther from the output layer 26 than the intermediate layer 23 is. The intermediate layer 22 therefore has less influence than the intermediate layer 23 on the normal output value 46 output from the output nodes 26a to 26c of the output layer 26. For this reason, of the two selected intermediate layers 22 and 23, the downward intermediate layer 22 is deleted.
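The decision in step S7 amounts to comparing an average with a threshold, as the following sketch shows. The function name, the return convention (`None` when nothing is deleted), and the sample numbers are assumptions for illustration.

```python
def decide_deletion(similarities, reference_value, lower_layer, upper_layer):
    """Return the layer to delete, or None if the pair is not similar enough.

    similarities    : one similarity per item of test data
    reference_value : preset similarity reference value
    lower_layer     : the selected layer nearer the input layer
    upper_layer     : the selected layer nearer the output layer (kept)
    """
    mean = sum(similarities) / len(similarities)
    if mean > reference_value:
        return lower_layer  # the downward layer is the one removed
    return None


to_delete = decide_deletion([0.90, 0.95, 0.97], 0.9, lower_layer=22, upper_layer=23)
```

Here the average is 0.94, which exceeds the hypothetical reference value 0.9, so layer 22 is selected for deletion.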

The reason why the machine learning device 100 calculates the similarity from the temporary output values output from the two temporary output layers, and uses that similarity to determine whether the two intermediate layers are similar, is as follows. One conceivable way to determine whether two intermediate layers are similar is to compare the output values of one intermediate layer with the output values of the other. However, such a direct comparison makes it difficult to determine whether the two intermediate layers are similar. There are two reasons for this difficulty.

The first reason is that the two selected intermediate layers do not necessarily have the same number of nodes. In the example shown in FIG. 4, the intermediate layers 22 and 23 selected in step S2 have the same number of nodes. However, when the two selected intermediate layers have different numbers of nodes, the output values from the nodes of one intermediate layer cannot simply be compared with the output values from the nodes of the other.

The second reason is that, even when the two selected intermediate layers have similar functions and the same number of nodes, the output values from the nodes of one intermediate layer are not necessarily similar to the output values from the nodes of the other. This is explained in detail below.

When the error backpropagation method is used to train the neural network 3A shown in FIG. 4, the initial values of the weight coefficients of the signal paths in the sections A1 and B1 are each set randomly so as to fall within a predetermined variance range. In addition, the error propagating downward through the section A1 is smaller than the error propagating downward through the section B1, so the amount by which the weight coefficients of the signal paths in the section A1 are updated differs from the amount by which those in the section B1 are updated.

As a result, even when the intermediate layers 22 and 23 have similar functions, the weight coefficients of two corresponding signal paths in the sections A1 and B1 are not the same. For example, in FIG. 4, the signal path between the node 21b and the node 22a in the section A1 corresponds to the signal path between the node 22b and the node 23a in the section B1. However, the weight coefficients of these two signal paths are not the same. Even if the value output from the node 22a matches the value output from the corresponding node 23a, it cannot be concluded that the functions of the selected intermediate layers 22 and 23 are similar.

The machine learning device 100 therefore connects a temporary output layer to each of the two selected intermediate layers and determines whether the functions of the two selected intermediate layers are similar based on the similarity of the output values from the two temporary output layers.

The number of nodes in the temporary output layer connected to one intermediate layer is made equal to the number of nodes in the temporary output layer connected to the other intermediate layer. In addition, each node of the two temporary output layers is made to correspond one-to-one with a node of the output layer. This makes clear the correspondence between the nodes of the temporary output layer connected to one intermediate layer and the nodes of the temporary output layer connected to the other intermediate layer.

The weight coefficients of the signal paths between an intermediate layer and the temporary output layer connected to it are updated by the additional learning (step S4). In the additional learning, the errors input to the temporary output nodes are calculated from the correct values that the output nodes 26a to 26c should output. Therefore, when the test data 5 is input to the neural network 3B, the temporary output values output from a temporary output layer correspond to parameters that indicate the relationship between the output values from the nodes of the connected intermediate layer and the normal output value 46 output from the output nodes. In other words, the temporary output values output from the nodes of a temporary output layer can be used as parameters that reflect the function of the intermediate layer to which that temporary output layer is connected.

Specifically, the temporary output value 42 is calculated from the output values of the intermediate layer 22 and is therefore a parameter that reflects the function of the intermediate layer 22. The temporary output value 43 is calculated from the output values of the intermediate layer 23 and is therefore a parameter that reflects the function of the intermediate layer 23. Furthermore, because the correspondence between the temporary output nodes 32a to 32c of the temporary output layer 32 and the temporary output nodes 33a to 33c of the temporary output layer 33 is clear, the correspondence between the temporary output values 42a to 42c and the temporary output values 43a to 43c is also clear. Accordingly, by using the similarity calculated from the probability density distribution of the temporary output value 42 and that of the temporary output value 43, it is possible to determine whether the two selected intermediate layers 22 and 23 are similar.

{3.1.7. Reconfiguration of neural network 3C}
When the deletion determination unit 16 has decided to delete the downward one of the two intermediate layers, the reconfiguration unit 17 reconfigures the neural network 3A (step S8).

For example, when it has been decided to delete the intermediate layer 22 out of the intermediate layers 22 and 23, the reconfiguration unit 17 removes, in the neural network 3A, the connections between the nodes of the input layer 21 and the nodes of the intermediate layer 22, and the connections between the nodes of the intermediate layer 22 and the nodes of the intermediate layer 23. The reconfiguration unit 17 then sets new signal paths by connecting the nodes of the input layer 21 to the nodes of the intermediate layer 23. In this way, the reconfiguration unit 17 generates a neural network 3D, which is the neural network 3A with the intermediate layer 22 deleted.

FIG. 5 is a diagram showing the configuration of the neural network 3D generated in step S8. As shown in FIG. 5, the neural network 3D, unlike the neural network 3A shown in FIG. 2, does not include the intermediate layer 22. In the neural network 3D, the input nodes 21a to 21d of the input layer 21 are connected directly to the nodes 23a to 23d of the intermediate layer 23. Hereinafter, the section between the input layer 21 and the intermediate layer 23 is referred to as the section A2.

The signal paths in the section A2 are newly set by the reconfiguration unit 17. The reconfiguration unit 17 sets the initial values of the weight coefficients of the signal paths in the section A2 so that they fall within a predetermined variance range. The reconfiguration unit 17 then performs relearning of the neural network 3D (step S9).

The relearning uses the learning data that was used when the neural network 3A was trained. The relearning algorithm is the same as the algorithm used to train the neural network 3A. Through the relearning, not only the weight coefficients of the signal paths in the section A2 but also those in the sections C1, D1, and E1 of the neural network 3D are updated. This allows the neural network 3D to achieve learning accuracy equivalent to that of the neural network 3A.
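The structural part of the reconfiguration (before relearning) can be sketched as a list operation on the per-section weight matrices. This is an illustrative sketch only: the data layout, the function name `remove_layer`, and the Gaussian initialisation with `std=0.1` are assumptions, and the relearning of step S9 is not shown.

```python
import random


def remove_layer(sizes, weights, index, std=0.1, seed=None):
    """Delete intermediate layer `index` and create the bridging section.

    sizes   : node counts, ordered input layer ... output layer
    weights : weights[k] is the matrix for the section from layer k to k+1,
              stored as rows (one per destination node) of source weights
    Returns (new_sizes, new_weights); the inputs are not modified.
    """
    rng = random.Random(seed)
    new_sizes = sizes[:index] + sizes[index + 1:]
    # New section (e.g. A2): randomly initialised, assumed Gaussian
    fresh = [[rng.gauss(0.0, std) for _ in range(sizes[index - 1])]
             for _ in range(sizes[index + 1])]
    # Drop the two sections touching the deleted layer (e.g. A1 and B1)
    new_weights = weights[:index - 1] + [fresh] + weights[index + 1:]
    return new_sizes, new_weights


# Layers 21, 22, 23, 24, 25, 26 with sections A1, B1, C1, D1, E1
sizes = [4, 4, 4, 4, 4, 3]
weights = [[[0.0] * sizes[k] for _ in range(sizes[k + 1])] for k in range(5)]
new_sizes, new_weights = remove_layer(sizes, weights, index=1, seed=0)  # delete layer 22
```

The remaining sections (C1, D1, E1 in this example) are carried over unchanged and would then be updated together with the new section during relearning.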

{3.1.8. End determination}
When the reconfiguration unit 17 has finished relearning the neural network 3D, the end determination unit 18 determines whether to end the process of deleting intermediate layers (step S10).

The end determination unit 18 determines that the process of deleting intermediate layers is to end when the number of intermediate layers deleted from the neural network 3A reaches a preset end reference value. The number of deleted intermediate layers is obtained by subtracting the number of intermediate layers in the neural network 3D from the number of intermediate layers in the neural network 3A. If more intermediate layers than the end reference value were deleted from the neural network 3A, the neural network 3D would become much smaller than the neural network 3A that was input to the machine learning device 100. This could adversely affect the learning accuracy of the neural network 3D. To prevent a decrease in the learning accuracy of the neural network 3D, the end determination unit 18 determines whether to continue deleting intermediate layers based on the number of intermediate layers already deleted.

At this point, only the intermediate layer 22 has been deleted from the neural network 3A, so the number of deleted intermediate layers is 1. If the end reference value is set to 2, the number of deleted intermediate layers has not reached the end reference value (No in step S10). The end determination unit 18 therefore continues the process of deleting intermediate layers from the neural network 3A. Specifically, the end determination unit 18 outputs the relearned neural network 3D to the intermediate layer selection unit 12 and the temporary output layer adding unit 13. The machine learning device 100 returns to step S2 and executes the process of deleting an intermediate layer from the neural network 3D.
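The end determination of step S10 can be illustrated with a minimal sketch. The function names are hypothetical, and treating "reaches" as a greater-than-or-equal comparison is an assumption; the layer counts follow the example in the text (four intermediate layers 22 to 25 in the neural network 3A, end reference value 2).

```python
def count_deleted_layers(original_count, current_count):
    """Number of intermediate layers removed so far (original minus current)."""
    return original_count - current_count


def should_end_deletion(original_count, current_count, end_reference):
    """True when the number of deleted layers has reached the end reference value.

    Using >= rather than == is an assumption made for this sketch.
    """
    return count_deleted_layers(original_count, current_count) >= end_reference


# After deleting intermediate layer 22, three intermediate layers remain.
after_first_deletion = should_end_deletion(4, 3, 2)
# After also deleting intermediate layer 24, two intermediate layers remain.
after_second_deletion = should_end_deletion(4, 2, 2)
print(after_first_deletion, after_second_deletion)
```

With these inputs the first check corresponds to "No in step S10" and the second to "Yes in step S10".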

{3.2. Deletion of intermediate layer 24}
{3.2.1. Selection of intermediate layers and addition of temporary output layer}
The intermediate layer selection unit 12 and the temporary output layer adding unit 13 acquire the neural network 3D from the end determination unit 18. In the neural network 3D, the intermediate layer selection unit 12 selects two intermediate layers from among the intermediate layers that were not selected in the previous selection (step S2). Specifically, the intermediate layers 22 and 23 were selected in the previous selection. The intermediate layer selection unit 12 therefore selects the intermediate layers 24 and 25 from among the intermediate layers 23 to 25 of the neural network 3D.

The temporary output layer adding unit 13 adds a new temporary output layer to the neural network 3D based on the selection made by the intermediate layer selection unit 12 (step S3). FIG. 6 shows a neural network 3E, which is the neural network 3D with a temporary output layer added. The input layer 21 and the intermediate layer 22 are omitted from the neural network 3E shown in FIG. 6.

As shown in FIG. 6, of the intermediate layers 24 and 25 selected by the intermediate layer selection unit 12, the temporary output layer adding unit 13 connects a temporary output layer 34 only to the intermediate layer 24 and does not connect a temporary output layer to the intermediate layer 25. The intermediate layer 25 is adjacent to the output layer 26. The temporary output layer adding unit 13 therefore determines that the output layer 26 can be used as the temporary output layer for the intermediate layer 25.

As shown in FIG. 6, the temporary output layer 34 has the same number of nodes as the output layer 26. The temporary output layer 34 has temporary output nodes 34a to 34c. The temporary output node 34a corresponds to the output node 26a, the temporary output node 34b corresponds to the output node 26b, and the temporary output node 34c corresponds to the output node 26c.

Each of the temporary output nodes 34a to 34c is connected to the nodes 24a to 24d of the intermediate layer 24. When adding the temporary output layer 34, the temporary output layer adding unit 13 sets initial values for the weighting coefficients of the signal paths in section H1 between the intermediate layer 24 and the temporary output layer 34. As with the addition of the temporary output layers 32 and 33, the initial values of the weighting coefficients are set randomly so as to fall within a preset variance range.
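The fully connected section H1 and its random initialization can be sketched as follows. This is only an illustration: drawing the initial values from a zero-mean Gaussian with a chosen standard deviation is an assumption (the text only says the values fall within a preset variance range), and `add_temporary_output_layer`, `std`, and `seed` are hypothetical names.

```python
import random


def add_temporary_output_layer(num_hidden_nodes, num_output_nodes, std=0.1, seed=None):
    """Return initial weights for the paths from an intermediate layer to a
    temporary output layer.

    weights[j][i] is the weighting coefficient of the signal path from
    intermediate node i to temporary output node j. Every temporary output
    node is connected to every node of the intermediate layer.
    """
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, std) for _ in range(num_hidden_nodes)]
        for _ in range(num_output_nodes)
    ]


# Section H1: four nodes (24a to 24d) feeding three temporary output nodes (34a to 34c).
weights_h1 = add_temporary_output_layer(num_hidden_nodes=4, num_output_nodes=3, seed=0)
print(len(weights_h1), len(weights_h1[0]))
```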

{3.2.2. Additional learning}
The additional learning unit 14 receives the neural network 3E output from the temporary output layer adding unit 13 and performs additional learning of the neural network 3E (step S4). The additional learning of the neural network 3E updates the weighting coefficients of the signal paths in section H1. It does not update the weighting coefficients of the signal paths in sections A2, C1, D1, and E1. The additional learning of the neural network 3E is the same as the additional learning of the neural network 3B, so a detailed description is omitted.
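The idea of updating only section H1 while leaving the rest of the network untouched can be sketched as below. The text does not specify the learning algorithm here, so softmax outputs trained by gradient descent on a cross-entropy loss are assumptions, as are all names. The `features` stand for the outputs of intermediate layer 24, computed by the frozen part of the network and treated as fixed inputs.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def logits(head, x):
    """Weighted sums at the temporary output nodes for one feature vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in head]


def train_temporary_head(features, labels, head, lr=0.5, epochs=200):
    """Update only `head` (the section H1 weights), in place.

    Nothing upstream of `features` is modified, which mirrors leaving the
    other sections of the network unchanged during additional learning.
    """
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(logits(head, x))
            for j, row in enumerate(head):
                grad = probs[j] - (1.0 if j == y else 0.0)
                for i in range(len(row)):
                    row[i] -= lr * grad * x[i]
    return head


# Hypothetical toy data: three feature vectors from intermediate layer 24, one per class.
features = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
labels = [0, 1, 2]
head = [[0.0] * 4 for _ in range(3)]
train_temporary_head(features, labels, head)
predictions = [max(range(3), key=lambda j: logits(head, x)[j]) for x in features]
print(predictions)
```

Because the feature vectors are passed in as plain data, the sketch cannot alter any weights other than those of the temporary output layer.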

{3.2.3. Calculation of similarity}
The calculation unit 15 receives the neural network 3E output from the additional learning unit 14. The calculation unit 15 inputs test data 5 to the acquired neural network 3E and executes a calculation using the neural network 3E (step S5).

As the calculation result, the neural network 3E outputs temporary output values 44a to 44c from the temporary output nodes 34a to 34c and outputs normal output values 46a to 46c from the output nodes 26a to 26c. Hereinafter, the temporary output values 44a to 44c are collectively referred to as the temporary output values 44.

The deletion determination unit 16 acquires the temporary output values 44 and the normal output values 46 from the calculation unit 15. Using the acquired temporary output values 44 and normal output values 46, the deletion determination unit 16 calculates the similarity between the intermediate layers 24 and 25 selected by the intermediate layer selection unit 12 (step S6). The similarity between the intermediate layers 24 and 25 can be calculated from equation (1) above by letting p(i) be the probability density distribution of the temporary output values 44 and q(i) be the probability density distribution of the normal output values 46.

The calculation unit 15 inputs a predetermined number of test data 5 to the neural network 3E and executes calculations using the neural network 3E (step S5). The deletion determination unit 16 acquires the temporary output values 44 and the normal output values 46 corresponding to each test data 5 and calculates the similarity corresponding to each test data 5 (step S6).

{3.2.4. Deletion determination}
When the calculations of the neural network 3E using the predetermined number of test data 5 are complete, the deletion determination unit 16 calculates the average of the calculated similarities and compares this average with a deletion reference value. Assume here that the similarity between the intermediate layers 24 and 25 is larger than the deletion reference value. In this case, the deletion determination unit 16 determines that one of the intermediate layers 24 and 25 can be deleted (Yes in step S7). Of the intermediate layers 24 and 25, the deletion determination unit 16 decides to delete the intermediate layer 24, which is located on the lower side.
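The averaging and comparison in step S7 can be sketched as follows. The per-test-data similarity values and the deletion reference value are made-up numbers for illustration, and `can_delete` is a hypothetical name; the direction of the comparison (deletion is possible when the average is larger than the reference) follows the embodiment.

```python
def can_delete(similarities, deletion_reference):
    """Average the per-test-data similarities and compare with the deletion reference value."""
    average = sum(similarities) / len(similarities)
    return average > deletion_reference


# Hypothetical similarities computed for three test data items.
similar_pair = can_delete([0.9, 0.8, 0.7], deletion_reference=0.75)
dissimilar_pair = can_delete([0.5, 0.6], deletion_reference=0.75)
print(similar_pair, dissimilar_pair)
```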

{3.2.5. Reconfiguration of neural network}
The reconfiguration unit 17 reconfigures the neural network 3D based on the decision by the deletion determination unit 16 to delete the intermediate layer 24 (step S8). Specifically, in the neural network 3D, the reconfiguration unit 17 disconnects the nodes 23a to 23d of the intermediate layer 23 from the nodes 24a to 24d of the intermediate layer 24, and disconnects the nodes 24a to 24d of the intermediate layer 24 from the nodes 25a to 25d of the intermediate layer 25. The reconfiguration unit 17 then connects each of the nodes 23a to 23d of the intermediate layer 23 to the nodes 25a to 25d of the intermediate layer 25. As a result, the intermediate layer 24 is deleted from the neural network 3D.

FIG. 7 shows a neural network 3F, which is the neural network 3D shown in FIG. 5 with the intermediate layer 24 deleted. As shown in FIG. 7, unlike the neural network 3D shown in FIG. 5, the neural network 3F does not include the intermediate layer 24. In the neural network 3F, each of the nodes 23a to 23d of the intermediate layer 23 is directly connected to the nodes 25a to 25d of the intermediate layer 25. Hereinafter, the section between the intermediate layer 23 and the intermediate layer 25 in the neural network 3F is referred to as section C2.

The signal paths in section C2 are newly set by the reconfiguration unit 17. The reconfiguration unit 17 sets the initial values of the weighting coefficients of the signal paths in section C2 so that they fall within a predetermined variance range. The reconfiguration unit 17 then performs relearning of the neural network 3F (step S9). Through relearning, not only the weighting coefficients of the signal paths in section C2 but also those of the signal paths in sections A2 and E1 are updated in the neural network 3F. The relearning of the neural network 3F is the same as the relearning of the neural network 3D, so a detailed description is omitted.
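The reconfiguration step (removing one intermediate layer and creating a new, randomly initialized section between its neighbours) can be sketched with plain lists. The data layout, the Gaussian initialization, the node count of input layer 21, and all names are assumptions made for illustration; relearning (step S9) is not shown.

```python
import random


def delete_layer(layers, weights, index, std=0.1, seed=None):
    """Remove layers[index] and bridge its two neighbours.

    layers:  list of (name, node_count), ordered from input to output.
    weights: weights[i] connects layers[i] to layers[i + 1] as a
             [node_count_out][node_count_in] matrix.
    The two sections touching the deleted layer are discarded and replaced by
    one new section with random initial weights; other sections are kept.
    """
    if not 0 < index < len(layers) - 1:
        raise ValueError("only an intermediate layer can be deleted")
    rng = random.Random(seed)
    n_in = layers[index - 1][1]
    n_out = layers[index + 1][1]
    bridge = [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)]
    new_layers = layers[:index] + layers[index + 1:]
    new_weights = weights[:index - 1] + [bridge] + weights[index + 1:]
    return new_layers, new_weights


def zeros(n_out, n_in):
    """Placeholder weights standing in for already learned coefficients."""
    return [[0.0] * n_in for _ in range(n_out)]


# Neural network 3D: input 21, intermediate 23 to 25, output 26.
# The node count of input layer 21 (4) is an assumption.
layers_3d = [("21", 4), ("23", 4), ("24", 4), ("25", 4), ("26", 3)]
weights_3d = [zeros(4, 4), zeros(4, 4), zeros(4, 4), zeros(3, 4)]  # A2, C1, D1, E1
layers_3f, weights_3f = delete_layer(layers_3d, weights_3d, index=2, seed=0)
print([name for name, _ in layers_3f])
```

In the result, the first and last sections (A2 and E1) are carried over unchanged, while sections C1 and D1 are replaced by a single new section corresponding to C2.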

When the reconfiguration unit 17 finishes relearning the neural network 3F, the end determination unit 18 determines whether to end the process of deleting intermediate layers from the neural network 3A (step S10).

At this point, the intermediate layers 22 and 24 have been deleted from the neural network 3A, so the number of deleted intermediate layers is 2. If the end reference value is set to 2, the number of deleted intermediate layers has reached the end reference value (Yes in step S10). In this case, the end determination unit 18 determines that the process of deleting intermediate layers from the neural network 3A should end. The machine learning device 100 outputs the relearned neural network 3F.

The end determination unit 18 also ends the process of deleting intermediate layers from the neural network 3A when all the intermediate layers of the neural network 3A have been selected. This is because, once all the intermediate layers have been selected, the machine learning device 100 can no longer identify a new intermediate layer that can be deleted.

In this way, the machine learning device 100 selects two intermediate layers in the input neural network 3A and connects a temporary output layer to each of the two selected intermediate layers. Based on the temporary output values output from the temporary output layer connected to one of the two selected intermediate layers and the temporary output values output from the temporary output layer connected to the other, the machine learning device 100 determines whether one of the two selected intermediate layers can be deleted. When the machine learning device 100 determines that deletion is possible, it generates a network in which one of the two selected intermediate layers has been deleted from the neural network 3A. The machine learning device 100 can thereby create the neural network 3F, which has learning accuracy equivalent to that of the neural network 3A and is smaller in scale than the neural network 3A. In other words, the machine learning device 100 can create a neural network that has the learning accuracy required in advance and is of an appropriate scale.
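The overall control flow (steps S2 to S10) can be summarized in a simplified sketch. It covers only selection, the deletion decision, and the two end conditions; adding temporary output layers, additional learning, and relearning are hidden behind a hypothetical `similarity_of` callback, and picking the first two not-yet-selected layers is a simplification of the selection in the embodiment.

```python
def prune(layers, similarity_of, deletion_reference, end_reference):
    """Repeatedly pick two not-yet-selected intermediate layers and delete the
    lower one when their averaged similarity exceeds the deletion reference.

    layers:        intermediate layer ids ordered from input side to output side.
    similarity_of: hypothetical callback returning the averaged similarity of
                   two layers (stands in for steps S3 to S6).
    """
    remaining = list(layers)
    deleted = []
    selected = set()
    while len(deleted) < end_reference:
        candidates = [layer for layer in remaining if layer not in selected]
        if len(candidates) < 2:
            break  # every intermediate layer has already been selected
        first, second = candidates[0], candidates[1]
        selected.update((first, second))
        if similarity_of(first, second) > deletion_reference:
            remaining.remove(first)  # delete the layer on the input side
            deleted.append(first)
    return remaining, deleted


# Hypothetical run mirroring the example: every selected pair is judged similar.
remaining, deleted = prune(
    [22, 23, 24, 25],
    similarity_of=lambda a, b: 0.9,
    deletion_reference=0.5,
    end_reference=2,
)
print(remaining, deleted)
```

With these inputs, layers 22 and 24 are deleted and layers 23 and 25 remain, which matches the sequence from neural network 3A to 3F described above.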

{Modifications}
In the above embodiment, an example was described in which the intermediate layer selection unit 12 selects the intermediate layers 24 and 25 from the neural network 3D (see FIG. 5), but the invention is not limited to this. The intermediate layer selection unit 12 may select the intermediate layer 23 again in the neural network 3D. Specifically, when the neural network 3D is input, the intermediate layer selection unit 12 may select the intermediate layers 23 and 24. If the similarity between the intermediate layers 23 and 24 exceeds the deletion reference value, the intermediate layer 23 is deleted from the neural network 3D.

In the above embodiment, an example was described in which the intermediate layer selection unit 12 selects intermediate layers in order starting from the one located on the lower side, but the invention is not limited to this. The order in which the intermediate layer selection unit 12 selects intermediate layers is not particularly limited, and the layers may be selected in order starting from the one located on the upper side.

In the above embodiment, an example was described in which the end determination unit 18 determines whether to end the process of deleting intermediate layers from the neural network 3A based on whether the number of deleted intermediate layers has reached the end reference value, but the invention is not limited to this. The end determination unit 18 may determine to end the process of deleting intermediate layers when the number of intermediate layers in the neural network output from the reconfiguration unit 17 reaches a preset reference value. For example, consider a case where the reference value is set to 3. In this case, the neural network 3D shown in FIG. 5 has three intermediate layers, so the end determination unit 18 does not execute the process of deleting the intermediate layer 24 from the neural network 3D. The end determination unit 18 outputs the neural network 3D to the outside of the machine learning device 100.
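This alternative end condition, based on the number of remaining intermediate layers rather than the number deleted, can be sketched as follows. The function name is hypothetical, and using a less-than-or-equal comparison for "reaches" is an assumption.

```python
def should_end_by_remaining(current_layer_count, reference_count):
    """True when the number of remaining intermediate layers has reached the reference value."""
    return current_layer_count <= reference_count


# Neural network 3D has three intermediate layers (23 to 25); the reference value is 3.
stop_at_3d = should_end_by_remaining(3, 3)
# Neural network 3A has four intermediate layers (22 to 25).
stop_at_3a = should_end_by_remaining(4, 3)
print(stop_at_3d, stop_at_3a)
```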

In the above embodiment, an example was described in which the deletion determination unit 16 calculates the similarity between two intermediate layers using equation (1), but the invention is not limited to this. The deletion determination unit 16 may normalize each of the temporary output values 42a to 42c to a value from 0 to 1 and normalize each of the temporary output values 43a to 43c in the same way. The deletion determination unit 16 may then calculate the absolute difference between each pair of corresponding temporary output values and use the sum of the absolute differences as the similarity. In this case, the closer the similarity is to 0, the more similar the functions of the two intermediate layers are.
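This alternative similarity can be sketched as follows. The text does not say how the values are normalized to the range 0 to 1, so min-max scaling within each set of output values is an assumption, and the function names and sample values are hypothetical.

```python
def normalize(values):
    """Scale values to the range 0 to 1 (min-max scaling, an assumed choice)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # avoid division by zero for constant outputs
    return [(v - lo) / (hi - lo) for v in values]


def abs_diff_similarity(first_outputs, second_outputs):
    """Sum of absolute differences between corresponding normalized outputs.

    Values closer to 0 indicate that the two intermediate layers behave more alike.
    """
    a = normalize(first_outputs)
    b = normalize(second_outputs)
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical temporary output values (42a to 42c) and (43a to 43c).
same_shape = abs_diff_similarity([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
reversed_shape = abs_diff_similarity([0.0, 1.0, 2.0], [2.0, 1.0, 0.0])
print(same_shape, reversed_shape)
```

Note that with this measure the deletion decision would compare in the opposite direction from the embodiment, since a smaller value means greater similarity.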

Part or all of the processing of each functional block (each functional unit) of the machine learning device 100 in the above embodiment may be implemented by a program. In the machine learning device 100 of the above embodiment, part or all of the processing of each functional block is performed by a central processing unit (CPU) in a computer. The program for each process is stored in a storage device such as a hard disk or a ROM, and is executed from the ROM or after being read out to a RAM. For example, configuring the machine learning device 100 as shown in FIG. 13 allows part or all of the processing of each functional block (each functional unit) of the above embodiments to be executed.

Each process of the above embodiment may be implemented by hardware or by software (including cases where it is implemented together with an OS (operating system), middleware, or a predetermined library). It may also be implemented by a combination of software and hardware.

The execution order of the processing method in the above embodiment is not necessarily limited to the order described, and the order may be changed without departing from the gist of the invention.

A computer program that causes a computer to execute the method described above, and a computer-readable recording medium on which the program is recorded, are included in the scope of the present invention. Examples of the computer-readable recording medium include a flexible disk, a hard disk, a CD-ROM, an MO, a DVD, a DVD-ROM, a DVD-RAM, a large-capacity DVD, a next-generation DVD, and a semiconductor memory.

The computer program is not limited to one recorded on the recording medium and may be transmitted via a telecommunication line, a wireless or wired communication line, a network such as the Internet, or the like.

The term "unit" may be a concept that includes "circuitry". Circuitry may be implemented, in whole or in part, by hardware, software, or a combination of hardware and software.

Embodiments of the present invention have been described above, but these embodiments are merely examples for carrying out the present invention. The present invention is therefore not limited to the embodiments described above, and the embodiments may be modified as appropriate without departing from the spirit of the invention.

DESCRIPTION OF SYMBOLS
100 Machine learning device
11 Network acquisition unit
12 Intermediate layer selection unit
13 Temporary output layer adding unit
14 Additional learning unit
15 Calculation unit
16 Deletion determination unit
17 Reconfiguration unit
18 End determination unit

Claims

A machine learning device comprising:
an acquisition unit that acquires a trained neural network including at least two intermediate layers;
an intermediate layer selection unit that selects a first intermediate layer and a second intermediate layer from the at least two intermediate layers;
a temporary output layer adding unit that generates a temporary network by adding, to the neural network, a first temporary output layer having temporary output nodes connected to nodes of the first intermediate layer and a second temporary output layer having temporary output nodes connected to nodes of the second intermediate layer;
a calculation unit that inputs test data to the temporary network, executes a calculation using the temporary network, and acquires a first temporary output value output from the temporary output nodes of the first temporary output layer and a second temporary output value output from the temporary output nodes of the second temporary output layer;
a deletion determination unit that uses the first temporary output value and the second temporary output value to calculate a similarity indicating the degree to which the function of the first intermediate layer is similar to the function of the second intermediate layer, and determines, based on the calculated similarity, whether to delete the first intermediate layer; and
a reconfiguration unit that deletes the first intermediate layer from the neural network when the deletion determination unit determines to delete the first intermediate layer.

The machine learning device according to claim 1,
wherein the first intermediate layer is disposed closer to an input layer of the neural network than the second intermediate layer.

The machine learning device according to claim 1 or 2,
wherein the number of temporary output nodes of the first temporary output layer is the same as the number of temporary output nodes of the second temporary output layer.

The machine learning device according to claim 3,
wherein the number of temporary output nodes of the first temporary output layer is the same as the number of output nodes of an output layer of the neural network.

The machine learning device according to any one of claims 1 to 4,
wherein the temporary output layer adding unit determines to use the output layer as the second temporary output layer when the nodes of the second intermediate layer are connected to the nodes of the output layer of the neural network.

The machine learning device according to any one of claims 1 to 5, further comprising:
an additional learning unit that performs learning to update the weighting coefficients set in the signal paths between the nodes of the first intermediate layer and the nodes of the first temporary output layer, and the weighting coefficients set in the signal paths between the nodes of the second intermediate layer and the nodes of the second temporary output layer.

The machine learning device according to any one of claims 1 to 6,
wherein, when the first intermediate layer is the k-th layer (k is a natural number from 2 to m−1) counting from the input layer of the neural network, the reconfiguration unit newly connects the nodes of the (k−1)-th layer to the nodes of the (k+1)-th layer and performs learning to update the weighting coefficients set in the signal paths between the nodes of the (k−1)-th layer and the nodes of the (k+1)-th layer.

A machine learning method comprising:
acquiring a trained neural network including at least two intermediate layers;
selecting a first intermediate layer and a second intermediate layer from the at least two intermediate layers;
generating a temporary network by adding, to the neural network, a first temporary output layer having temporary output nodes connected to nodes of the first intermediate layer and a second temporary output layer having temporary output nodes connected to nodes of the second intermediate layer;
inputting test data to the temporary network, executing a calculation using the temporary network, and acquiring a first temporary output value output from the temporary output nodes of the first temporary output layer and a second temporary output value output from the temporary output nodes of the second temporary output layer;
calculating, using the first temporary output value and the second temporary output value, a similarity indicating the degree to which the function of the first intermediate layer is similar to the function of the second intermediate layer, and determining, based on the calculated similarity, whether to delete the first intermediate layer; and
deleting the first intermediate layer from the neural network when it is determined to delete the first intermediate layer.