WO2016189675A1 - Neural network learning device and learning method - Google Patents

Neural network learning device and learning method

Info

Publication number
WO2016189675A1
Authority
WO
WIPO (PCT)
Prior art keywords
learning
training data
neural network
unit
divided
Prior art date
Application number
PCT/JP2015/065159
Other languages
French (fr)
Japanese (ja)
Inventor
Yasuyuki Kudo
Junichi Miyakoshi
Original Assignee
Hitachi, Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi, Ltd.
Priority to JP2017520142A (JP6258560B2)
Priority to PCT/JP2015/065159
Publication of WO2016189675A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • the present invention relates to a learning device and a learning method for a neural network.
  • supervised machine learning in which the relationship between system inputs and outputs is modeled by a neural network (hereinafter abbreviated as NN) to predict responses to unknown inputs or classify patterns, is widely used.
  • NN: neural network
  • Supervised machine learning requires a learning phase: multiple training data samples, each consisting of an input and an output, are prepared, and the various parameters of the NN are adjusted until the relationship between the inputs and outputs of all training data satisfies a predetermined standard.
  • Patent Document 1 describes a method in which NNs are separated and merged after each NN performs desired learning.
  • Patent Document 2 describes another method of solving the same problem.
  • This publication describes a method for preferentially learning non-conforming training data in which the relationship between input and output does not satisfy the standard.
  • In the method described in Patent Document 1, the training data to be learned by the two divided NNs are, for example, the alphabet characters A to L and M to Z, respectively. That is, this method presupposes that the user classifies the training data to be input in advance. It is therefore difficult to apply when the user does not know how to classify the training data, for example when training data having the same output must be divided into two. Furthermore, the method is highly effective when the learning times for A to L and for M to Z are equal, but its effect diminishes when the two learning times differ greatly. In general, the learning time depends strongly on the characteristics of the training data, so a large benefit is difficult to obtain consistently.
  • The method of Patent Document 2, on the other hand, preferentially learns non-conforming training data and adjusts the various parameters of the NN. As a result, in the subsequent stage of re-learning all training data, some of the training data that previously conformed may become non-conforming. This tendency is pronounced when, for example, the characteristics of the conforming and non-conforming training data differ greatly, and much time is then considered necessary for re-learning.
  • The present invention has been made in view of the above problems. Its purpose is to provide a learning device and a learning method that can stably shorten the NN learning time even for training data that is difficult for a user to classify or training data whose characteristics differ greatly.
  • The present application includes a plurality of means for solving the above problems. One example is a learning device comprising: a neural network dividing unit that divides a neural network into a plurality of neural networks; a training data dividing unit that divides training data consisting of a plurality of samples used for learning the neural network into a plurality of training data sets; a divided neural network learning unit that uniquely assigns one of the plurality of neural networks to each of the plurality of training data sets and executes learning with the assigned neural network; a neural network integration unit that integrates the plurality of neural networks whose learning results satisfy a predetermined condition, thereby generating an integrated neural network; and an integrated neural network learning unit that executes learning with the integrated neural network using the training data before division.
  • the NN learning time can be stably reduced even for training data that is difficult for the user to classify or training data having greatly different characteristics. Further, it is possible to minimize the resources necessary for forming the NN.
  • Example 1 shows a method of shortening the learning time by dividing the NN into two parts, learning using different training data, and then re-learning all training data with the integrated NN.
  • This method will be described with reference to FIGS. 1 to 6.
  • FIG. 1 is a functional block diagram of the information processing apparatus according to the first embodiment, which is roughly composed of an NN learning device and a storage device.
  • 101 is an NN learning device
  • 102 is an NN dividing unit
  • 103 is a training data dividing unit
  • 104 to 105 are divided NN learning units
  • 106 is an NN integrating unit
  • 107 is an integrated NN learning unit.
  • Reference numeral 108 denotes a storage device
  • 109 denotes an NN information storage unit
  • 110 denotes a division information storage unit
  • 111 denotes a training data storage unit
  • 112 denotes a learning result storage unit. All information in these storage units can be read and written by the user.
  • the NN learning device 101 includes a processor and a memory as a hardware configuration. Various functions of the NN learning device 101 can be realized by executing a program stored in the memory by the processor.
  • 201 is NN division
  • 202 is training data division
  • 203 is division NN learning
  • 204 is setting change
  • 205 is NN integration
  • 206 is integrated NN learning.
  • First, the NN division 201 is executed. This processing is realized by the NN dividing unit 102: the NN given from the NN information storage unit 109 is divided into two based on the division information given from the division information storage unit 110.
  • The division information given from the division information storage unit 110 specifies the ratio and the number of parts into which the NN is divided. In the first embodiment, it is assumed to contain the instruction "divide the NN equally into two, assign one part to the divided NN learning unit 104 and the other to the divided NN learning unit 105".
  • 301 is an input block
  • 302 to 303 are first layer feature extraction blocks
  • 304 to 305 are second layer feature extraction blocks
  • 306 is a link between feature extraction blocks
  • 307 is an output block.
  • The NN of the first embodiment is assumed to be a convolutional NN, and one feature extraction block is formed of a convolution layer (units labeled c) and a pooling layer (units labeled p). Each feature extraction block therefore has the function of independently extracting features of the training data. Considering this function, dividing the NN in units of feature extraction blocks is efficient, and as a result the re-learning time after NN integration can be shortened.
  • the NN division processing 201 can be realized by the configuration and operation described above.
  • Next, training data division 202 is executed. This process is realized by the training data dividing unit 103: the training data given from the training data storage unit 111 is divided into two based on the division information given from the division information storage unit 110.
  • the division information given from the division information storage unit 110 includes information on how to group and divide the training data.
  • In the first embodiment, it is assumed that the division information contains the instruction "divide the training data into two equal halves by sample number, assign the first half to the divided NN learning unit 104 and the second half to the divided NN learning unit 105".
  • An image of the training data given from the training data storage unit 111 is shown in FIG. 5.
  • As shown in FIG. 5, one sample of the training data consists of a set of input vector elements iv1 to iv4 and output vector elements ov1 to ov2, and 400 samples are prepared. Therefore, when the training data is divided into two according to the above instruction, it is split into the two groups of sample numbers 1 to 200 and 201 to 400. In the first embodiment, it is assumed that the value of the input vector changes continuously with the sample number. In this case, grouping the first and second halves of the sample numbers increases the likelihood that the characteristics of the training data of the two groups differ greatly.
  • the divided NN learning 203 is executed next. This process is realized by the divided NN learning units 104 to 105, and learning of the divided NN is performed in parallel using the divided training data.
  • 601 is an initial setting
  • 602 is a head training data input
  • 603 is an NN calculation
  • 604 is an error calculation
  • 605 is training data update
  • 606 is parameter adjustment
  • 607 is a result list generation.
  • initial setting 601 is first executed. This process mainly involves initial setting of the link coupling coefficient and bias value.
  • As a predetermined value, a random number from −1 to +1, for example, is set as the initial value.
  • the leading training data input 602 is executed next. Specifically, the training data sample number 1 shown in FIG. 5 is selected, and the input vector is transferred to the input block 401 of the divided NN and the output vector is transferred to the output block 403.
  • Next, the NN operation 603 is executed. This process corresponds to the "forward operation" of the convolutional NN: convolution (filtering), pooling (subsampling), activation, and other processing are applied to the input training data, and finally the output value of each unit in the output block 403 is calculated.
  • The error calculation 604 compares this calculation result with the output vector of the training data and computes the error. Thereafter, training data update 605 is executed; in the first embodiment this corresponds to selecting the training data of the next sample number. The NN operation 603 and error calculation 604 are then executed again on that training data, and this series of processes is repeated until all training data have been processed. Thereafter, if the total error is less than or equal to a predetermined standard, learning is judged to have succeeded, result list creation 606 is executed, and learning ends.
  • the result list includes information indicating success or failure of learning, error for each training data, latest information on various parameter setting values related to the divided NN, and the like.
  • Otherwise, parameter adjustment 607 is executed. This process corresponds to the "backward operation" of the convolutional NN: using the backpropagation method, the error gradient at the output block 403 is propagated back toward the input block 401. This makes it possible to correct the coupling coefficient and bias value of each link so that the error becomes smaller.
  • the “forward calculation” using the training data is executed again, and the parameter adjustment 607 is repeated until the total sum of errors is below the reference. If the processing time exceeds a predetermined standard during this operation, it is determined that learning has failed, result list creation 606 is performed, and learning ends.
  • If any divided NN fails to learn, setting change 204 is executed.
  • In the first embodiment this process is performed by the user, for example to change the initial setting 601 of a divided NN, to increase its number of layers or units, or to divide the divided NNs and training data further.
  • To this end, the contents of the NN information storage unit 109 and the division information storage unit 110 are corrected. Learning of the divided NNs is then executed again, and this series of processes is repeated until all divided NNs learn successfully.
  • When all divided NNs have learned successfully, NN integration 205 is executed next. This processing is realized by the NN integration unit 106. First, the duplicated input blocks 401 and 402 and the duplicated output blocks 403 and 404 are each merged back into a single common block. Then the link 306 deleted in the NN division processing is restored. With this processing the NN can be integrated.
  • After the divided NNs are integrated, integrated NN learning 206 is executed. This process is realized by the integrated NN learning unit 107, and its content is almost the same as the divided NN learning 203. The difference is that the coupling coefficients and bias values adjusted by the divided NN learning 203 are used as the initial values of the corresponding links in the integrated NN. Since the restored link 306 has no learning result, its initial value is, for example, a random number from −1 to +1.
  • If learning of the integrated NN fails, setting change 204 is executed again, and this series of processing is repeated until the integrated NN learns successfully.
  • the latest information on various parameter setting values related to the integrated NN is transferred to the NN information storage unit 109 and used in the test phase after the learning phase.
  • As described above, the information processing apparatus of the first embodiment divides the convolutional NN into two in units of feature extraction blocks, trains the divided NNs in parallel using training data split into the first and second halves of the sample numbers, and then re-learns all training data with the integrated NN.
  • This makes it possible to generate divided NNs with high identification capability for training data whose input vector values change continuously with the sample number, and to shorten the re-learning time after NN integration. It is therefore possible to provide an information processing method and system that can shorten the NN learning time, which is the object of the present invention.
  • In the first embodiment the number of divisions of the NN and the training data is two, but the number of divisions is not limited to this and may be three or more.
  • Likewise, although the NN and the training data are divided into equal parts, the division is not limited to this and may be uneven.
  • Furthermore, the number of training data samples, the number of input/output vector elements, and so on can be set freely.
  • Example 2 shows a method for appropriately dividing training data when the characteristics of the training data are unknown.
  • FIG. 7 is a functional block diagram of the information processing apparatus of the second embodiment, in which 701 is a division information storage unit and 702 is an NN analysis unit. The other parts perform the same processing as the parts shown in FIG. 1 of the first embodiment and are therefore given the same reference numerals.
  • The characteristic of the processing of the second embodiment is that a training data analysis process is newly added to the training data division 202 shown in FIG. 2.
  • the data division 202 is realized by the NN analysis unit 702 illustrated in FIG. 7, and divides the training data provided from the training data storage unit 111 based on the division information provided from the division information storage unit 701.
  • The division information given from the division information storage unit 701 specifies the ratio and the number of parts into which the training data is divided. In the second embodiment, it is assumed to contain the instruction "divide the training data equally into two, assign one part to the divided NN learning unit 104 and the other to the divided NN learning unit 105".
  • Next, the NN analysis unit 702 executes an analysis for dividing the training data equally into two.
  • In the second embodiment, so-called clustering, which classifies the training data by closeness in Euclidean distance, is used as this analysis method.
  • A representative algorithm is, for example, the k-means method.
  • The clustering analysis divides the training data into two groups and also generates information on which group the training data of each sample number belongs to.
  • As described above, the information processing apparatus of the second embodiment divides the convolutional NN into two, trains the divided NNs in parallel using training data divided into two by clustering analysis, and then re-learns all training data with the integrated NN. This makes it possible to generate divided NNs with high identification capability even for training data whose characteristics are unknown, and to shorten the re-learning time after NN integration. It is therefore possible to provide an information processing method and system that can stably shorten the NN learning time even for training data that is difficult for the user to classify or training data whose characteristics differ greatly, which is the object of the present invention.
  • As in the first embodiment, the number and balance of the NN and training data divisions, the number of training data samples, the number of input/output vector elements, and so on can be set freely. Furthermore, the first and second embodiments can be switched as different processing modes. This makes it possible to stably shorten the NN learning time for training data with more diverse characteristics.
  • Example 3 shows another method for appropriately dividing training data when the characteristics of the training data are unknown.
  • This method will be described with reference to FIGS. 8 and 9.
  • FIG. 8 is a functional block diagram of the information processing apparatus of the third embodiment, in which 801 is a division information storage unit and 802 is a division adjustment unit. The other parts perform the same processing as the parts shown in FIG. 1 of the first embodiment and are therefore given the same reference numerals.
  • The characteristic of the processing of the third embodiment is that the training data is adaptively classified by using the NN itself as a classifier. Details of this processing are described below with reference to the flowchart of FIG. 9, in which 901 is initial NN learning, 902 is non-conforming training data extraction, 903 is NN addition, 904 is additional NN learning, 905 is NN integration, 906 is integrated NN learning, and 907 is result list creation.
  • the initial NN learning 901 is executed.
  • This process is almost the same as the divided NN learning 203 of the first embodiment, but is characterized in that all training data is learned by one divided NN, for example the upper divided NN in FIG. 4.
  • This can be realized by devising the division information output by the division adjustment unit 802.
  • For example, the division information output to the NN dividing unit 102 is set to "divide the NN equally into two and assign one part to the divided NN learning unit 104", and the division information output to the training data dividing unit 103 is set to "assign all training data to the divided NN learning unit 104".
  • If the initial NN learning 901 succeeds, result list creation 907 is executed and learning ends; if it fails, non-conforming training data extraction 902 is executed.
  • This process is realized by the division adjustment unit 802: the error calculated in the initial NN learning 901 is checked for each training data sample, and training data whose error is greater than or equal to a predetermined standard are extracted as non-conforming training data.
  • After the non-conforming training data are extracted, NN addition 903 and additional NN learning 904 are executed. Prior to this, it is checked whether the resources necessary for forming an NN can be secured, that is, whether there is an unlearned divided NN that can be added; if no NN can be added, result list creation 907 is executed and learning ends.
  • The processing of NN addition 903 and additional NN learning 904 is almost the same as the divided NN learning 203 of the first embodiment, but is characterized in that the non-conforming training data is learned by a different divided NN, for example the lower divided NN in FIG. 4.
  • This can be realized by updating the division information output by the division adjustment unit 802.
  • For example, the division information output to the NN dividing unit 102 is set to "assign the remaining one of the two divided NNs to the divided NN learning unit 105", and the division information output to the training data dividing unit 103 is set to "assign the non-conforming training data to the divided NN learning unit 105".
  • If learning of the additional NN fails, non-conforming training data extraction 902 is executed again, and this series of processes, in which non-conforming training data is learned by adding an NN, is repeated until resources can no longer be secured.
  • Next, NN integration 905 and integrated NN learning 906 are executed. These processes are the same as the NN integration 205 and the integrated NN learning 206 of the first embodiment.
  • Note that in the third embodiment an unlearned divided NN may remain; such an unlearned divided NN is not integrated, and only the learned divided NNs (additional NNs) are integrated.
  • If learning of the integrated NN fails, non-conforming training data extraction 902 is performed again, and this series of processes, in which non-conforming training data is learned by adding an NN, is repeated until integrated NN learning succeeds or resources can no longer be secured.
  • the latest information on various parameter setting values related to the integrated NN is transferred to the NN information storage unit 109 and used in the test phase after the learning phase.
  • the result list generated by the result list creation 907 includes information indicating success or failure of learning, error for each training data, latest information on various parameter setting values related to the divided NN, and the like. These are transferred to the learning result storage unit 112 and used for analysis of the learning result of the user.
  • As described above, the information processing apparatus of the third embodiment divides the convolutional NN into two, classifies the training data into conforming and non-conforming data using one of the divided NNs, learns the non-conforming training data with an added divided NN, and then re-learns all training data with the integrated NN. This makes it possible to generate divided NNs with high identification capability even for training data whose characteristics are unknown, and to shorten the re-learning time after NN integration. It is therefore possible to provide an information processing method and system that can stably shorten the NN learning time even for training data that is difficult for the user to classify or training data whose characteristics differ greatly, which is the object of the present invention.
  • In addition, since NNs are added adaptively according to the learning results, the resources necessary for forming the NN can be saved. Furthermore, compared with the clustering of training data shown in the second embodiment, the method of the third embodiment is more direct, so the NN learning time can be stably shortened for training data with even more diverse characteristics.
  • As in the previous embodiments, the number and balance of the NN and training data divisions, the number of training data samples, the number of input/output vector elements, and so on can be set freely. Further, the first to third embodiments can be switched as different processing modes, which makes it possible to stably shorten the NN learning time for training data with various characteristics.
  • In the third embodiment, all training data is learned in the initial NN learning 901 and non-conforming training data is then extracted, but the present invention is not limited to this.
  • FIG. 10 shows the result of actually learning NN using this method. It can be seen that the present invention succeeds in learning with a very small number of learning repetitions compared to a method that does not perform divided learning.
  • Embodiment 4 shows a method for more efficiently realizing the division and addition of NNs in the information processing apparatuses of Embodiments 1 to 3.
  • The characteristic of the processing of the fourth embodiment is that the training data is classified according to the magnitude of the error obtained by the error calculation 604, and the NN division and addition policy is determined according to the result.
  • This method will be described with reference to FIG. 11.
  • FIG. 11 is an example of a list of errors calculated by the error calculation 604. This error list is generated by the divided NN learning units 104 and 105. As shown in FIG. 11, the training data are, for example, sorted in descending order of error and then classified into levels according to the magnitude of the error (a small sketch of this sorting and classification is given at the end of this section). In FIG. 11, the number of samples is reduced to 20 to simplify the description.
  • As described above, the information processing apparatus of the fourth embodiment determines the NN division and addition policy according to the magnitude of the error of the training data, in addition to the configurations and operations described in the first to third embodiments. As a result, the success rate of NN learning can be further increased.
  • Although the number of error levels shown in FIG. 11 is four, it is not limited to this, and should be set appropriately in consideration of resource conditions and the like.
  • Embodiment 5 shows a method for minimizing resources necessary for forming an NN in the information processing apparatuses of Embodiments 1 to 4.
  • the characteristic of the processing of the fifth embodiment is that the NN scale is reduced and re-learning is performed on the NN that has succeeded in learning, and if the learning succeeds again, excess NN resources are accumulated.
  • FIG. 12 is a flowchart for realizing the resource minimum processing, where 1201 is NN scale reduction, 1202 is reduced NN learning, and 1203 is surplus resource accumulation. These processes are realized by the divided NN learning units 104 and 105.
  • the NN scale reduction 1201 receives the divided NN, the initial NN, or the additional NN that has been successfully learned, and reduces the scale of the NN by a predetermined amount.
  • As a method for reducing the NN scale, reduction in units of intermediate layers, feature extraction blocks, or individual units, for example, can be considered.
  • the reduced NN learning 1202 is executed. This process is the same as the divided NN learning 203 of the first embodiment, and the description thereof is omitted.
  • the surplus resource accumulation 1203 is executed.
  • The NN resources removed by the NN scale reduction 1201 are accumulated as surplus resources. Thereafter, the NN scale reduction 1201 is executed again, and this series of processing is repeated until learning of the reduced NN fails.
  • the accumulated surplus resources are transferred to the NN information storage unit 109, and the resource minimization process is completed.
  • the accumulated surplus resources are recycled when a new NN needs to be added.
  • As described above, the information processing method and system of the fifth embodiment perform, in addition to the configurations and operations shown in the first to fourth embodiments, a resource minimization process that minimizes the resources necessary for forming the NN.
  • the learning time of the NN can be shortened with fewer resources. Further, with limited resources, it is possible to stably shorten the learning time of the NN for training data with more diverse characteristics.
  • In the embodiments described above, the NN is assumed to be a convolutional NN, but the present invention is not limited to this and can be applied to other types of NN.
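
As an illustration only (not part of the original disclosure), the sorting and level classification described for the fourth embodiment could be sketched in Python as follows; the equal-width binning rule and the random example errors are assumptions, not the patent's exact procedure.

```python
import numpy as np

def classify_by_error(per_sample_error, n_levels=4):
    """Sketch of the error list of FIG. 11: sort the training data in
    descending order of error and bin them into error-magnitude levels."""
    per_sample_error = np.asarray(per_sample_error)
    order = np.argsort(per_sample_error)[::-1]            # sample indices, largest error first
    edges = np.linspace(per_sample_error.min(),
                        per_sample_error.max(), n_levels + 1)
    levels = np.digitize(per_sample_error, edges[1:-1])    # level 0 (small) .. n_levels-1 (large)
    return order, levels

# 20 samples as in FIG. 11, with hypothetical error values
errors = np.random.default_rng(0).uniform(0.0, 1.0, size=20)
order, levels = classify_by_error(errors)
```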

Abstract

There have been cases in which a large amount of learning time is required when training data that is difficult for a user to classify or training data having different characteristics is to be learned in a neural network (NN) learning phase. The present invention divides training data in accordance with the characteristics of the training data, causes the divided training data to be learned in separated NNs, and thereafter causes the entirety of the training data to be relearned in an integrated NN.

Description

Neural network learning device and learning method
The present invention relates to a learning device and a learning method for a neural network.
So-called supervised machine learning, in which the relationship between the inputs and outputs of a system is modeled by a neural network (hereinafter abbreviated as NN) in order to predict responses to unknown inputs or to classify patterns, is widely used. In general, supervised machine learning requires a learning phase: multiple training data samples, each consisting of an input and an output, are prepared, and the various parameters of the NN are adjusted until the relationship between the inputs and outputs of all training data satisfies a predetermined standard.
Here, when the training data have many input/output vector elements or when the number of training data samples is large, the problem becomes complicated and adjusting the various parameters often takes a long time. Patent Document 1 describes one method of solving this problem, in which the NN is separated, each NN performs the desired learning, and the NNs are then merged. Patent Document 2 describes another method of solving the same problem, in which non-conforming training data, whose input/output relationship does not satisfy the standard, are learned preferentially.
Patent Document 1: Japanese Translation of PCT International Application Publication No. H7-502357
Patent Document 2: Japanese Unexamined Patent Application Publication No. H5-197700
In the method described in Patent Document 1, the training data to be learned by the two divided NNs are, for example, the alphabet characters A to L and M to Z, respectively. That is, this method presupposes that the user classifies the training data to be input in advance. It is therefore difficult to apply when the user does not know how to classify the training data, for example when training data having the same output must be divided into two. Furthermore, the method is highly effective when the learning times for A to L and for M to Z are equal, but its effect diminishes when the two learning times differ greatly. In general, the learning time depends strongly on the characteristics of the training data, so a large benefit is difficult to obtain consistently.
The method described in Patent Document 2, on the other hand, preferentially learns non-conforming training data and adjusts the various parameters of the NN. As a result, in the subsequent stage of re-learning all training data, some of the training data that previously conformed may become non-conforming. This tendency is pronounced when, for example, the characteristics of the conforming and non-conforming training data differ greatly, and much time is then considered necessary for re-learning.
The present invention has been made in view of the above problems. Its purpose is to provide a learning device and a learning method that can stably shorten the NN learning time even for training data that is difficult for a user to classify or training data whose characteristics differ greatly.
In order to solve the above problems, for example, the configurations described in the claims are adopted. The present application includes a plurality of means for solving the above problems. One example is a learning device comprising: a neural network dividing unit that divides a neural network into a plurality of neural networks; a training data dividing unit that divides training data consisting of a plurality of samples used for learning the neural network into a plurality of training data sets; a divided neural network learning unit that uniquely assigns one of the plurality of neural networks to each of the plurality of training data sets and executes learning with the assigned neural network; a neural network integration unit that integrates the plurality of neural networks whose learning results satisfy a predetermined condition, thereby generating an integrated neural network; and an integrated neural network learning unit that executes learning with the integrated neural network using the training data before division.
According to the present invention, in the learning phase of a neural network, the NN learning time can be stably shortened even for training data that is difficult for the user to classify or training data whose characteristics differ greatly. In addition, the resources necessary for forming the NN can be minimized.
FIG. 1 is a block diagram explaining the configuration of an information processing apparatus.
FIG. 2 is a flowchart explaining the operation of the information processing apparatus.
FIG. 3 is a block diagram explaining the configuration of a neural network.
FIG. 4 is a block diagram explaining the configuration of a neural network.
FIG. 5 is a table explaining training data given to a neural network.
FIG. 6 is a flowchart explaining the operation of the information processing apparatus.
FIG. 7 is a block diagram explaining the configuration of an information processing apparatus.
FIG. 8 is a block diagram explaining the configuration of an information processing apparatus.
FIG. 9 is a flowchart explaining the operation of the information processing apparatus.
FIG. 10 is a time-series graph explaining the effect of the information processing apparatus.
FIG. 11 is a table explaining a method of classifying training data.
FIG. 12 is a flowchart explaining the operation of the information processing apparatus.
Hereinafter, embodiments will be described with reference to the drawings.
Embodiment 1 shows a method of shortening the learning time by dividing the NN into two, training each part with different training data, and then re-learning all training data with the integrated NN. This method will be described below with reference to FIGS. 1 to 6.
FIG. 1 is a functional block diagram of the information processing apparatus of the first embodiment, which is broadly composed of an NN learning device and a storage device. In FIG. 1, 101 is the NN learning device, 102 is an NN dividing unit, 103 is a training data dividing unit, 104 and 105 are divided NN learning units, 106 is an NN integration unit, and 107 is an integrated NN learning unit. Reference numeral 108 denotes a storage device, 109 an NN information storage unit, 110 a division information storage unit, 111 a training data storage unit, and 112 a learning result storage unit; all information in these storage units can be read and written by the user. Although not shown, the NN learning device 101 includes a processor and a memory as its hardware configuration, and the various functions of the NN learning device 101 are realized by the processor executing a program stored in the memory.
Next, the operation of the information processing apparatus of the first embodiment will be described with reference to the flowchart of FIG. 2. In FIG. 2, 201 is NN division, 202 is training data division, 203 is divided NN learning, 204 is setting change, 205 is NN integration, and 206 is integrated NN learning.
When learning the NN, NN division 201 is executed first. This processing is realized by the NN dividing unit 102: the NN given from the NN information storage unit 109 is divided into two based on the division information given from the division information storage unit 110. The NN division method is as follows. First, the division information given from the division information storage unit 110 specifies the ratio and the number of parts into which the NN is divided. In the first embodiment, it is assumed to contain the instruction "divide the NN equally into two, assign one part to the divided NN learning unit 104 and the other to the divided NN learning unit 105". Next, FIG. 3 shows an image of the NN given from the NN information storage unit 109. In FIG. 3, 301 is an input block, 302 and 303 are first-layer feature extraction blocks, 304 and 305 are second-layer feature extraction blocks, 306 is a link between feature extraction blocks, and 307 is an output block. The NN of the first embodiment is assumed to be a convolutional NN, and one feature extraction block is formed of a convolution layer (units labeled c) and a pooling layer (units labeled p). Each feature extraction block therefore has the function of independently extracting features of the training data. Considering this function, dividing the NN in units of feature extraction blocks is efficient, and as a result the re-learning time after NN integration can be shortened. FIG. 4 is an example in which the NN of FIG. 3 is divided into two equal parts according to this idea. Because the division is into two equal parts, the network is basically separated into an upper part and a lower part in units of feature extraction blocks; the input block and the output block, being common blocks, are duplicated and then separated. The links between blocks shown in FIG. 3 are also inherited, but the links between the divided NNs (hereinafter, divided NNs), that is, the link 306 shown in FIG. 3, are deleted. With the configuration and operation described above, the NN division processing 201 can be realized.
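As an illustration only (not part of the original disclosure), the division of a convolutional NN in units of feature extraction blocks could be sketched in Python roughly as follows; the PyTorch framework, the class name DividedNN, and the layer sizes are assumptions chosen to match the 4-element input and 2-element output vectors of FIG. 5.

```python
import torch
import torch.nn as nn

def feature_block(in_ch, out_ch):
    # One feature extraction block = convolution layer (c units) + pooling layer (p units).
    return nn.Sequential(
        nn.Conv1d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool1d(2),
    )

class DividedNN(nn.Module):
    """One of the two divided NNs: its own stack of feature extraction blocks
    plus copies of the (shared) input and output blocks."""
    def __init__(self, in_len=4, n_out=2):
        super().__init__()
        self.features = nn.Sequential(
            feature_block(1, 4),   # first-layer feature extraction block
            feature_block(4, 8),   # second-layer feature extraction block
        )
        # after two poolings the length-4 input is reduced to length 1
        self.output = nn.Linear(8 * (in_len // 4), n_out)  # output block

    def forward(self, x):          # x: (batch, 1, 4)
        h = self.features(x)
        return self.output(h.flatten(1))

# NN division 201: the cross links (306) are dropped simply by building the two
# halves as independent networks assigned to learning units 104 and 105.
divided_nn_104 = DividedNN()
divided_nn_105 = DividedNN()
```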
Next, training data division 202 is executed. This process is realized by the training data dividing unit 103: the training data given from the training data storage unit 111 is divided into two based on the division information given from the division information storage unit 110. The training data division method is as follows. First, the division information given from the division information storage unit 110 specifies how the training data is grouped and divided. In the first embodiment, it is assumed to contain the instruction "divide the training data into two equal halves by sample number, assign the first half to the divided NN learning unit 104 and the second half to the divided NN learning unit 105". Next, FIG. 5 shows an image of the training data given from the training data storage unit 111. As shown in FIG. 5, one sample of the training data consists of a set of input vector elements iv1 to iv4 and output vector elements ov1 to ov2, and 400 samples are prepared. Therefore, when the training data is divided into two according to the above instruction, it is split into the two groups of sample numbers 1 to 200 and 201 to 400. In the first embodiment, it is assumed that the value of the input vector changes continuously with the sample number. In this case, grouping the first and second halves of the sample numbers increases the likelihood that the characteristics of the training data of the two groups differ greatly. By training on these groups separately, divided NNs with higher discrimination ability can be generated, and the re-learning time after NN integration can also be shortened. Naturally, the value of the input vector may not change continuously with the sample number; division methods for that case are described in the second and subsequent embodiments.
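A minimal sketch of this training data division by sample number, assuming stand-in random data with the shapes of FIG. 5 (400 samples, 4 input elements, 2 output elements):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the training data of FIG. 5:
# 400 samples, input vectors iv1..iv4 and output vectors ov1..ov2.
inputs = rng.normal(size=(400, 4)).astype(np.float32)
outputs = rng.normal(size=(400, 2)).astype(np.float32)

# Training data division 202: the first half of the sample numbers goes to
# divided NN learning unit 104, the second half to unit 105.
half = len(inputs) // 2
data_104 = (inputs[:half], outputs[:half])    # samples 1..200
data_105 = (inputs[half:], outputs[half:])    # samples 201..400
```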
When the division of the NN and the training data is complete, divided NN learning 203 is executed next. This process is realized by the divided NN learning units 104 and 105, and the divided NNs are trained in parallel using the divided training data. Taking the upper divided NN in FIG. 4 as an example, the operation is described with reference to the flowchart of FIG. 6. In FIG. 6, 601 is initial setting, 602 is first training data input, 603 is NN operation, 604 is error calculation, 605 is training data update, 606 is parameter adjustment, and 607 is result list creation. When learning a divided NN, initial setting 601 is executed first. This process mainly initializes the coupling coefficients and bias values of the links; as predetermined values, random numbers from −1 to +1, for example, are set as initial values. When the initial setting is complete, first training data input 602 is executed next. Specifically, sample number 1 of the training data shown in FIG. 5 is selected, its input vector is transferred to the input block 401 of the divided NN, and its output vector is transferred to the output block 403. Next, NN operation 603 is executed. This process corresponds to the "forward operation" of the convolutional NN: convolution (filtering), pooling (subsampling), activation, and other processing are applied to the input training data, and finally the output value of each unit in the output block 403 is calculated. The error calculation 604 compares this calculation result with the output vector of the training data and computes the error. Thereafter, training data update 605 is executed; in the first embodiment this corresponds to selecting the training data of the next sample number. NN operation 603 and error calculation 604 are then executed again on that training data, and this series of processes is repeated until all training data have been processed. Thereafter, if the total error is less than or equal to a predetermined standard, learning is judged to have succeeded, result list creation 606 is executed, and learning ends.
The result list includes information indicating success or failure of learning, the error for each training data sample, the latest values of the various parameter settings of the divided NN, and the like. These are transferred to the learning result storage unit 112 and used, for example, for the user's analysis of the learning results. On the other hand, if the total error exceeds the standard, parameter adjustment 607 is executed. This process corresponds to the "backward operation" of the convolutional NN: using the backpropagation method, the error gradient at the output block 403 is propagated back toward the input block 401. This makes it possible to correct the coupling coefficient and bias value of each link so that the error becomes smaller. After parameter adjustment 607, the "forward operation" using the training data is executed again, and parameter adjustment 607 is repeated until the total error falls below the standard. If the processing time exceeds a predetermined standard during this operation, learning is judged to have failed, result list creation 606 is performed, and learning ends.
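The divided NN learning loop described above (forward operation, error calculation over all samples, backpropagation-based parameter adjustment, success judged by a total-error standard and failure by a processing-time limit) could be sketched as follows; the function name, the squared-error measure, the SGD optimizer, and the concrete thresholds are assumptions, not the patent's specification.

```python
import time
import torch

def train_divided_nn(model, inputs, outputs, err_standard=1.0,
                     time_limit_s=60.0, lr=0.01):
    """Sketch of divided NN learning 203 for one learning unit.

    Repeats the forward operation (603) and error calculation (604) over all
    samples, followed by the backward operation / parameter adjustment, until
    the total error falls below the standard (success) or the processing time
    exceeds the limit (failure)."""
    x = torch.as_tensor(inputs).unsqueeze(1)           # (N, 1, 4) for Conv1d
    y = torch.as_tensor(outputs)                       # (N, 2)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    start = time.monotonic()
    while True:
        pred = model(x)                                # forward operation
        per_sample_err = ((pred - y) ** 2).sum(dim=1)  # error per training sample
        total_err = per_sample_err.sum()
        if total_err.item() <= err_standard:           # learning succeeded
            return True, per_sample_err.detach()
        if time.monotonic() - start > time_limit_s:    # learning failed
            return False, per_sample_err.detach()
        opt.zero_grad()
        total_err.backward()                           # backward operation (backpropagation)
        opt.step()                                     # parameter adjustment

# The two learning units could then be run in parallel processes or threads, e.g.:
# ok_104, err_104 = train_divided_nn(divided_nn_104, *data_104)
# ok_105, err_105 = train_divided_nn(divided_nn_105, *data_105)
```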
If any divided NN fails to learn as a result of training the divided NNs, setting change 204 is executed. In the first embodiment this process is performed by the user, who corrects the contents of the NN information storage unit 109 and the division information storage unit 110, for example to change the initial setting 601 of a divided NN, to increase its number of layers or units, or to divide the divided NNs and training data further. Learning of the divided NNs is then executed again, and this series of processes is repeated until all divided NNs learn successfully.
When all divided NNs have learned successfully, NN integration 205 is executed next. This processing is realized by the NN integration unit 106. First, the duplicated input blocks 401 and 402 and the duplicated output blocks 403 and 404 are each merged back into a single common block. Then the link 306 deleted in the NN division processing is restored. With this processing the NN can be integrated.
After the divided NNs are integrated, integrated NN learning 206 is executed. This process is realized by the integrated NN learning unit 107, and its content is almost the same as the divided NN learning 203. The difference is that the coupling coefficients and bias values adjusted by the divided NN learning 203 are used as the initial values of the corresponding links in the integrated NN. Since the restored link 306 has no learning result, its initial value is, for example, a random number from −1 to +1.
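As a simplified illustration of this step at the level of a single fully connected layer (the patent itself works with convolutional feature extraction blocks), the learned weights of the two divided NNs can seed the diagonal blocks of the integrated weight matrix, while the restored cross links, which have no learning result, start from random values in [-1, +1]; the matrix shapes below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def integrate_layer(w_upper, w_lower):
    """NN integration 205 / initialisation for integrated NN learning 206,
    sketched for one layer's weight matrix: learned weights of the two divided
    NNs form the diagonal blocks, restored cross links (306) are random."""
    out_u, in_u = w_upper.shape
    out_l, in_l = w_lower.shape
    w = rng.uniform(-1.0, 1.0, size=(out_u + out_l, in_u + in_l))  # restored links
    w[:out_u, :in_u] = w_upper          # learned block of the upper divided NN
    w[out_u:, in_u:] = w_lower          # learned block of the lower divided NN
    return w

# Hypothetical learned weights of one layer of each divided NN
w_upper = rng.normal(size=(3, 4))
w_lower = rng.normal(size=(3, 4))
w_integrated = integrate_layer(w_upper, w_lower)   # shape (6, 8)
```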
If learning of the integrated NN fails, setting change 204 is executed again, and this series of processing is repeated until the integrated NN learns successfully. The latest values of the various parameter settings of the integrated NN are transferred to the NN information storage unit 109 and used in the test phase that follows the learning phase.
As described above, the information processing apparatus of the first embodiment divides the convolutional NN into two in units of feature extraction blocks, trains the divided NNs in parallel using training data split into the first and second halves of the sample numbers, and then re-learns all training data with the integrated NN. This makes it possible to generate divided NNs with high identification capability for training data whose input vector values change continuously with the sample number, and to shorten the re-learning time after NN integration. It is therefore possible to provide an information processing method and system that can shorten the NN learning time, which is the object of the present invention.
In the first embodiment the number of divisions of the NN and the training data is two, but the number of divisions is not limited to this and may be three or more. Likewise, although the NN and the training data are divided into equal parts, the division is not limited to this and may be uneven. Furthermore, the number of training data samples, the number of input/output vector elements, and so on can be set freely.
Embodiment 2 shows a method for appropriately dividing the training data when the characteristics of the training data are unknown. This method will be described below with reference to the newly added FIG. 7. FIG. 7 is a functional block diagram of the information processing apparatus of the second embodiment, in which 701 is a division information storage unit and 702 is an NN analysis unit. The other parts perform the same processing as the parts shown in FIG. 1 of the first embodiment and are therefore given the same reference numerals.
The characteristic of the processing of the second embodiment is that a training data analysis process is newly added to the training data division 202 shown in FIG. 2. The training data division 202 is realized by the NN analysis unit 702 shown in FIG. 7, which divides the training data given from the training data storage unit 111 based on the division information given from the division information storage unit 701. Here, the division information given from the division information storage unit 701 specifies the ratio and the number of parts into which the training data is divided. In the second embodiment, it is assumed to contain the instruction "divide the training data equally into two, assign one part to the divided NN learning unit 104 and the other to the divided NN learning unit 105". Next, the NN analysis unit 702 executes an analysis for dividing the training data equally into two. In the second embodiment, so-called clustering, which classifies the training data by closeness in Euclidean distance, is used as this analysis method; a representative algorithm is, for example, the k-means method. The clustering analysis divides the training data into two groups and also generates information on which group the training data of each sample number belongs to.
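A minimal sketch of this clustering analysis using scikit-learn's k-means, assuming stand-in data; note that plain k-means does not guarantee that the two groups have exactly equal size, so an exactly even split would need an additional balancing step.

```python
import numpy as np
from sklearn.cluster import KMeans

# Clustering analysis in the NN analysis unit 702: group the training data
# by closeness in Euclidean distance (k-means with two clusters).
inputs = np.random.default_rng(0).normal(size=(400, 4))   # stand-in for FIG. 5 inputs

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(inputs)

group_104 = np.flatnonzero(labels == 0)   # sample numbers assigned to learning unit 104
group_105 = np.flatnonzero(labels == 1)   # sample numbers assigned to learning unit 105
```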
 Except for the training data division 202 described above, the configuration and operation are the same as in Embodiment 1, so their description is omitted.
 As described above, the information processing apparatus of Embodiment 2 divides the convolutional NN into two, trains the divided NNs in parallel using training data divided into two by clustering analysis, and then retrains the integrated NN with all of the training data. This makes it possible to generate divided NNs with high discrimination ability for training data whose characteristics are unknown, and to shorten the relearning time after NN integration. It therefore provides an information processing method and system that achieve the object of the present invention: stably shortening the NN learning time even for training data that the user finds difficult to classify or training data whose characteristics differ greatly.
 As in Embodiment 1, the number of divisions and the balance of the NN and the training data, the number of training data samples, the number of elements of the input/output vectors, and so on can be set freely. Furthermore, Embodiment 1 and Embodiment 2 can be switched as different processing modes, which makes it possible to stably shorten the NN learning time for training data with a wider variety of characteristics.
 Embodiment 3 shows another method for appropriately dividing the training data when its characteristics are unknown. This method is described below with reference to the newly added FIGS. 8 and 9. FIG. 8 is a functional block diagram of the information processing apparatus of Embodiment 3, in which 801 is a division information storage unit and 802 is a division adjustment unit. The other units perform the same processing as the units shown in FIG. 1 of Embodiment 1 and are therefore given the same reference numerals.
 The processing of Embodiment 3 is characterized in that the NN itself is used as a classifier to classify the training data adaptively. The details of this processing are described below using the flowchart of FIG. 9, in which 901 is initial NN learning, 902 is non-conforming training data extraction, 903 is NN addition, 904 is additional NN learning, 905 is NN integration, 906 is integrated NN learning, and 907 is result list creation.
 In learning the NN, initial NN learning 901 is executed first. This processing is almost the same as the divided NN learning 203 of Embodiment 1, but is characterized in that all of the training data is learned by a single divided NN, for example the upper divided NN in FIG. 4. This can be realized by devising the division information output by the division adjustment unit 802: for example, the division information output to the NN division unit 102 is set to "divide the NN equally into two and assign one of them to the divided NN learning unit 104", and the division information output to the training data division unit 103 is set to "assign all of the training data to the divided NN learning unit 104".
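 As a minimal sketch only (the record layout and identifiers are assumptions, not part of the disclosure), the division information that the division adjustment unit 802 outputs for the initial NN learning 901 could be represented as simple records:

```python
# Illustrative sketch of division information for initial NN learning 901.
# Keys and unit identifiers are hypothetical.
nn_division_info = {
    "n_parts": 2,                                   # divide the NN equally into two
    "assignments": {"part_0": "divided_NN_learning_unit_104"},
}
training_data_division_info = {
    "assignments": {"all_samples": "divided_NN_learning_unit_104"},  # all data to one unit
}
```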
 If the learning succeeds as a result of the initial NN learning 901, result list creation 907 is executed and the learning ends; if it fails, non-conforming training data extraction 902 is executed. This processing is realized by the division adjustment unit 802, which examines, for each training data sample, the error calculated in the preceding initial NN learning 901 and extracts the training data whose error is equal to or greater than a predetermined criterion as non-conforming training data. After the non-conforming training data has been extracted, NN addition 903 and additional NN learning 904 are executed; prior to this, it is checked whether the resources required to form an NN can be secured, that is, whether an unlearned divided NN that can be added exists, and if no NN can be added, result list creation 907 is executed and the learning ends.
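 A minimal sketch of the non-conforming training data extraction 902, assuming per-sample errors from the preceding learning pass and a hypothetical resource check, might be:

```python
# Illustrative sketch of non-conforming training data extraction 902:
# keep samples whose error is at or above the predetermined criterion.
def extract_nonconforming(errors, criterion):
    """errors: iterable of (sample_id, error) pairs from the previous pass."""
    return [sid for sid, err in errors if err >= criterion]

# Usage (resource check before NN addition 903 / additional NN learning 904):
# nonconforming = extract_nonconforming(error_list, criterion=0.1)
# if not unlearned_divided_nn_available():   # hypothetical resource check
#     create_result_list(); return
```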
 The processing of the NN addition 903 and the additional NN learning 904 is almost the same as the divided NN learning 203 of Embodiment 1, but is characterized in that the non-conforming training data is learned by a divided NN other than the one above, for example the lower divided NN in FIG. 4. This can be realized by updating the division information output by the division adjustment unit 802: for example, the division information output to the NN division unit 102 is set to "assign the remaining one of the two divided NNs to the divided NN learning unit 105", and the division information output to the training data division unit 103 is set to "assign the non-conforming training data to the divided NN learning unit 105". If the learning of the additional NN fails, non-conforming training data extraction 902 is executed again, and the sequence of adding an NN and learning the non-conforming training data is repeated until the learning of an additional NN succeeds or resources can no longer be secured.
 If the learning of the additional NN succeeds, NN integration 905 and integrated NN learning 906 are executed. These processes are the same as the NN integration and integrated NN learning 206 of Embodiment 1. In Embodiment 3, unlearned divided NNs may also exist, but they are not integrated; only the divided NNs that have been trained (including the additional NNs) are integrated. If the learning fails when the integrated NN is trained with all of the training data, non-conforming training data extraction 902 is executed again, and the sequence of adding an NN and learning the non-conforming training data is repeated until the learning of the integrated NN succeeds or resources can no longer be secured. As in Embodiment 1, the latest values of the various parameters of the integrated NN are transferred to the NN information storage unit 109 and used in the test phase after the learning phase. The result list generated by result list creation 907 includes, in addition to information indicating the success or failure of the learning, the error for each training data sample and the latest values of the various parameters of the divided NNs; these are transferred to the learning result storage unit 112 and used, for example, for analysis of the learning results by the user.
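 Putting steps 901 to 907 together, a hedged sketch of the adaptive loop of Embodiment 3 could look like the following; train, integrate and create_result_list are hypothetical stand-ins for the units described above, and train is assumed to return a success flag plus one error per sample it was trained on.

```python
# Illustrative sketch of the adaptive classification loop of Embodiment 3.
def embodiment3_learning(training_data, divided_nns, criterion):
    learned = []
    nn = divided_nns.pop(0)
    ok, errors = train(nn, training_data, criterion)             # 901 initial NN learning
    learned.append(nn)
    data = training_data
    while not ok:
        data = [s for s, e in zip(data, errors) if e >= criterion]  # 902 extraction
        if not divided_nns:                                       # no NN can be added
            break
        nn = divided_nns.pop(0)                                   # 903 NN addition
        ok, errors = train(nn, data, criterion)                   # 904 additional NN learning
        if not ok:
            continue                                              # extract again, add again
        learned.append(nn)
        merged = integrate(learned)                               # 905 NN integration
        ok, errors = train(merged, training_data, criterion)      # 906 integrated NN learning
        data = training_data
    return create_result_list(ok, errors, learned)                # 907 result list creation
```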
 As described above, the information processing apparatus of Embodiment 3 divides the convolutional NN into two, uses one divided NN to classify the training data into conforming and non-conforming data, relearns the non-conforming training data with the other divided NN, and then retrains the integrated NN with all of the training data. This makes it possible to generate divided NNs with high discrimination ability for training data whose characteristics are unknown, and to shorten the relearning time after NN integration. It therefore provides an information processing method and system that achieve the object of the present invention: stably shortening the NN learning time even for training data that the user finds difficult to classify or training data whose characteristics differ greatly. Furthermore, because NNs are added adaptively according to the learning results, the resources required to form NNs can be saved. In addition, compared with the method of clustering the training data shown in Embodiment 2, the method of Embodiment 3 is more direct, so the NN learning time can be stably shortened for training data with a wider variety of characteristics.
 As in Embodiment 1, the number of divisions and the balance of the NN and the training data, the number of training data samples, the number of elements of the input/output vectors, and so on can be set freely. Furthermore, Embodiments 1 to 3 can be switched as different processing modes, which makes it possible to stably shorten the NN learning time for training data with an even wider variety of characteristics.
 In Embodiment 3, all of the training data is learned in the initial NN learning 901 and the non-conforming training data is then extracted, but the method is not limited to this. For example, it is also possible to extract part of the training data at random, pre-train the initial NN on it, and then learn all of the training data and extract the non-conforming training data. This processing can be expected to generate an initial NN with higher discrimination ability. FIG. 10 shows the result of actually training an NN using this method; compared with a method that does not perform divided learning, the present invention succeeds in learning with an extremely small number of learning iterations.
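 A sketch of this variation, assuming a hypothetical train function and an arbitrary pre-training fraction, might be:

```python
# Illustrative sketch: pre-train the initial NN on a random subset of the
# training data, then run the full initial NN learning 901 on all of it.
import random

def pretrain_initial_nn(nn, training_data, criterion, fraction=0.1, seed=0):
    rng = random.Random(seed)
    subset = rng.sample(training_data, max(1, int(len(training_data) * fraction)))
    train(nn, subset, criterion)                 # pre-learning on the random subset
    return train(nn, training_data, criterion)   # initial NN learning on all data
```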
 Embodiment 4 shows a method for realizing the division and addition of NNs more efficiently in the information processing apparatuses of Embodiments 1 to 3. The processing of Embodiment 4 is characterized in that the training data is classified according to the magnitude of the error obtained in the error calculation 604, and the policy for dividing and adding NNs is decided according to the result. This method is described below with reference to the newly added FIG. 11.
 FIG. 11 is an example of the list of errors calculated in the error calculation 604. This error list is produced by the divided NN learning units 104 and 105; as shown in FIG. 11, the training data is sorted, for example, in descending order of error and further classified by error magnitude. In FIG. 11, the number of samples is reduced to 20 to simplify the explanation.
 As an example of using the error list of FIG. 11, an application to the NN addition 903 of Embodiment 3 can be considered. When the error levels of the non-conforming training data vary widely, as in the case of FIG. 11, relearning with a single additional NN is likely to fail. In such a case, the probability of successful learning is considered to increase if three additional NNs are prepared and the training data with large, medium, and small errors are learned separately. In this way, obtaining information on the magnitude of the error makes it possible to decide the NN addition policy efficiently. This idea is also extremely effective for the setting change 204 by the user shown in Embodiments 1 and 2.
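 A hedged sketch of building such an error list, with assumed bin edges and the four-level classification of FIG. 11, could be:

```python
# Illustrative sketch of the error list of FIG. 11: sort samples by error
# (largest first) and classify them into error levels. The bin edges and the
# level names are assumptions for illustration.
def build_error_list(errors, edges=(0.5, 0.2, 0.05)):
    """errors: dict mapping sample_id -> error from error calculation 604."""
    def level(err):
        if err >= edges[0]:
            return "large"
        if err >= edges[1]:
            return "medium"
        if err >= edges[2]:
            return "small"
        return "conforming"
    ranked = sorted(errors.items(), key=lambda kv: kv[1], reverse=True)
    return [(sid, err, level(err)) for sid, err in ranked]

# One additional NN can then be prepared per non-conforming level,
# as discussed above for the NN addition 903.
```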
 As described above, the information processing apparatus of Embodiment 4, in addition to the configuration and operation shown in Embodiments 1 to 3, decides the policy for dividing and adding NNs according to the magnitude of the error of the training data, which makes it possible to further increase the success rate of NN learning. The number of error level classifications shown in FIG. 11 is four, but it is not limited to this, and it is desirable to set it appropriately in consideration of the resource situation and the like.
 Embodiment 5 shows a method for minimizing the resources required to form NNs in the information processing apparatuses of Embodiments 1 to 4. The processing of Embodiment 5 is characterized in that an NN whose learning has succeeded is reduced in scale and retrained, and when the learning succeeds again, the surplus NN resources are accumulated. This method is described below with reference to the newly added FIG. 12. FIG. 12 is a flowchart for realizing this resource minimization processing, in which 1201 is NN scale reduction, 1202 is reduced NN learning, and 1203 is surplus resource accumulation. These processes are realized by the divided NN learning units 104 and 105.
 First, the NN scale reduction 1201 takes as input a divided NN, initial NN, or additional NN whose learning has succeeded and reduces the scale of the NN by a predetermined amount. The NN scale can be reduced, for example, in units of intermediate layers, feature extraction blocks, or individual units. Next, reduced NN learning 1202 is executed; this processing is the same as the divided NN learning 203 of Embodiment 1, so its description is omitted.
 If the learning of the reduced NN succeeds, surplus resource accumulation 1203 is executed. In this processing, the NN resources removed by the NN scale reduction 1201 are accumulated as surplus resources. The NN scale reduction 1201 is then executed again, and this sequence of processing is repeated until the learning of a reduced NN fails.
 When the learning of a reduced NN fails, it is determined that the NN scale cannot be reduced any further, the accumulated surplus resources are transferred to the NN information storage unit 109, and the resource minimization processing is completed. The accumulated surplus resources are recycled, for example, when a new NN needs to be added.
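 A minimal sketch of this resource minimization loop, with hypothetical shrink, train, and storage functions, might be:

```python
# Illustrative sketch of the resource minimization of Embodiment 5 (1201-1203).
def minimize_nn_resources(nn, training_data, criterion, step_units=1):
    surplus = 0
    while True:
        candidate, removed = shrink(nn, step_units)               # 1201 NN scale reduction
        ok, _ = train(candidate, training_data, criterion)        # 1202 reduced NN learning
        if not ok:                                                # cannot reduce any further
            break
        nn = candidate                                            # keep the smaller NN
        surplus += removed                                        # 1203 surplus resource accumulation
    store_surplus_resources(surplus)   # transfer to the NN information storage unit 109
    return nn
```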
 As described above, the information processing method and system of Embodiment 5 perform, in addition to the configuration and operation shown in Embodiments 1 to 4, resource minimization processing for minimizing the resources required to form NNs. This makes it possible to shorten the NN learning time with fewer resources and, under limited resources, to stably shorten the NN learning time for training data with a wider variety of characteristics.
 In Embodiments 1 to 5 above, the NN is a convolutional NN, but the invention is not limited to this and can be applied to other types of NN. When another type of NN is applied, however, it is desirable that the NN be relatively easy to divide, as a convolutional NN is.
101 NN learning device
102 NN division unit
103 training data division unit
104 divided NN learning unit
105 divided NN learning unit
106 NN integration unit
107 integrated NN learning unit
108 storage device
109 NN information storage unit
110 division information storage unit
111 training data storage unit
112 learning result storage unit
201 NN division
202 training data division
203 divided NN learning
204 setting change
205 NN integration
206 integrated NN learning
601 initial setting
602 first training data input
603 NN computation
604 error calculation
605 training data update
606 result list generation
607 parameter adjustment
701 division information storage unit
702 NN analysis unit
801 division information storage unit
802 division adjustment unit
901 initial NN learning
902 non-conforming training data extraction
903 NN addition
904 additional NN learning
905 NN integration
906 integrated NN learning
907 result list creation
1201 NN scale reduction
1202 reduced NN learning
1203 surplus resource accumulation

Claims (14)

  1.  A learning device that executes learning by a neural network, comprising:
     a neural network dividing unit that divides the neural network into a plurality of neural networks including a first and a second neural network;
     a training data dividing unit that divides training data consisting of a plurality of samples used for learning of the neural network into first and second training data;
     a divided neural network learning unit that executes learning by the first neural network using the first training data and executes learning by the second neural network using the second training data;
     a neural network integration unit that integrates the first and second neural networks after learning in the divided neural network learning unit has succeeded, to generate a third neural network; and
     an integrated neural network learning unit that executes learning by the third neural network using the training data before division.
  2.  The learning device according to claim 1, wherein
     the training data dividing unit divides the training data into the first and second training data based on Euclidean distances between the plurality of samples.
  3.  A learning device that executes learning by a neural network, comprising:
     a neural network dividing unit that divides the neural network into a plurality of neural networks;
     a training data dividing unit that divides training data consisting of a plurality of samples used for learning of the neural network into a plurality of training data;
     a divided neural network learning unit that uniquely assigns one of the plurality of neural networks to each of the plurality of training data and executes learning by the assigned neural network;
     a neural network integration unit that integrates the plurality of neural networks that have executed the learning, to generate an integrated neural network; and
     an integrated neural network learning unit that executes learning by the integrated neural network using the training data before division.
  4.  The learning device according to claim 3, wherein
     the training data dividing unit divides the training data according to a result of executing learning by one of the plurality of neural networks using the training data before division.
  5.  The learning device according to claim 4, wherein
     the result of executing the learning is an error between an output value included in the training data before division and a computation result of the integrated neural network.
  6.  The learning device according to claim 5, wherein
     the divided neural network learning unit determines whether the result of executing the learning using each of the plurality of training data satisfies a predetermined condition, and when it determines that the predetermined condition is not satisfied, executes learning by a different one of the divided neural networks.
  7.  The learning device according to claim 6, wherein
     the divided neural network learning unit, when it determines that the predetermined condition is satisfied, executes learning by a part of the assigned neural network.
  8.  The learning device according to claim 3, wherein
     the neural network is a convolutional neural network.
  9.  A learning method using a learning device that executes learning by a neural network, the method comprising:
     dividing the neural network into a plurality of neural networks;
     dividing training data consisting of a plurality of samples used for learning of the neural network into a plurality of training data;
     uniquely assigning one of the plurality of neural networks to each of the plurality of training data and executing learning by the assigned neural network;
     integrating the plurality of neural networks that have executed the learning to generate an integrated neural network; and
     executing learning by the integrated neural network using the training data before division.
  10.  The learning method according to claim 9, wherein
     the training data is divided according to a result of executing learning by one of the plurality of neural networks using the training data before division.
  11.  The learning method according to claim 10, wherein
     the result of executing the learning is an error between an output value included in the training data before division and a computation result of the integrated neural network.
  12.  The learning method according to claim 11, wherein
     it is determined whether the result of executing the learning using each of the plurality of training data satisfies a predetermined condition, and when it is determined that the predetermined condition is not satisfied, learning by a different one of the divided neural networks is executed.
  13.  The learning method according to claim 12, wherein
     when it is determined that the predetermined condition is satisfied, learning by a part of the assigned neural network is executed.
  14.  The learning method according to claim 9, wherein
     the neural network is a convolutional neural network.
PCT/JP2015/065159 2015-05-27 2015-05-27 Neural network learning device and learning method WO2016189675A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2017520142A JP6258560B2 (en) 2015-05-27 2015-05-27 Neural network learning apparatus and learning method
PCT/JP2015/065159 WO2016189675A1 (en) 2015-05-27 2015-05-27 Neural network learning device and learning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2015/065159 WO2016189675A1 (en) 2015-05-27 2015-05-27 Neural network learning device and learning method

Publications (1)

Publication Number Publication Date
WO2016189675A1 true WO2016189675A1 (en) 2016-12-01

Family

ID=57392959

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/065159 WO2016189675A1 (en) 2015-05-27 2015-05-27 Neural network learning device and learning method

Country Status (2)

Country Link
JP (1) JP6258560B2 (en)
WO (1) WO2016189675A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020094868A (en) * 2018-12-11 2020-06-18 トヨタ自動車株式会社 Full charge capacity learning device
JP2021507378A (en) * 2017-12-13 2021-02-22 アドバンスト・マイクロ・ディバイシズ・インコーポレイテッドAdvanced Micro Devices Incorporated Simultaneous training of functional subnetworks of neural networks
JP2022515302A (en) * 2019-11-25 2022-02-18 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Methods and equipment for training deep learning models, electronic devices, computer-readable storage media and computer programs
JP2022520912A (en) * 2020-01-22 2022-04-04 深▲チェン▼市商▲湯▼科技有限公司 Data processing methods, devices and chips, electronic devices, storage media
JP7453767B2 (en) 2019-09-25 2024-03-21 キヤノン株式会社 Information processing device, information processing method

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102190100B1 (en) * 2018-12-27 2020-12-11 (주)아크릴 Method for training of an artificial neural network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07502357A (en) * 1991-12-27 1995-03-09 アール・アンド・ディー・アソシエイツ Fast converging projective neural network
JPH0934862A (en) * 1995-07-19 1997-02-07 Hitachi Ltd Pattern learning method and device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07502357A (en) * 1991-12-27 1995-03-09 アール・アンド・ディー・アソシエイツ Fast converging projective neural network
JPH0934862A (en) * 1995-07-19 1997-02-07 Hitachi Ltd Pattern learning method and device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021507378A (en) * 2017-12-13 2021-02-22 アドバンスト・マイクロ・ディバイシズ・インコーポレイテッドAdvanced Micro Devices Incorporated Simultaneous training of functional subnetworks of neural networks
JP7246392B2 (en) 2017-12-13 2023-03-27 アドバンスト・マイクロ・ディバイシズ・インコーポレイテッド Simultaneous Training of Functional Subnetworks of Neural Networks
US11836610B2 (en) 2017-12-13 2023-12-05 Advanced Micro Devices, Inc. Concurrent training of functional subnetworks of a neural network
JP2020094868A (en) * 2018-12-11 2020-06-18 トヨタ自動車株式会社 Full charge capacity learning device
JP7453767B2 (en) 2019-09-25 2024-03-21 キヤノン株式会社 Information processing device, information processing method
JP2022515302A (en) * 2019-11-25 2022-02-18 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Methods and equipment for training deep learning models, electronic devices, computer-readable storage media and computer programs
JP7029554B2 (en) 2019-11-25 2022-03-03 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Methods and equipment for training deep learning models, electronic devices, computer-readable storage media and computer programs
JP2022520912A (en) * 2020-01-22 2022-04-04 深▲チェン▼市商▲湯▼科技有限公司 Data processing methods, devices and chips, electronic devices, storage media

Also Published As

Publication number Publication date
JP6258560B2 (en) 2018-01-10
JPWO2016189675A1 (en) 2017-08-17

Similar Documents

Publication Publication Date Title
JP6258560B2 (en) Neural network learning apparatus and learning method
US11741361B2 (en) Machine learning-based network model building method and apparatus
Mirza et al. Weighted online sequential extreme learning machine for class imbalance learning
CN111542843A (en) Active development with collaboration generators
US8626682B2 (en) Automatic data cleaning for machine learning classifiers
CN109543727B (en) Semi-supervised anomaly detection method based on competitive reconstruction learning
WO2017068675A1 (en) Program generation apparatus, program generation method, and generation program
CN113825978B (en) Method and device for defining path and storage device
US11176672B1 (en) Machine learning method, machine learning device, and machine learning program
WO2019102984A1 (en) Learning device and learning method, identification device and identification method, program, and recording medium
JP7268756B2 (en) Deterioration suppression program, degradation suppression method, and information processing device
JP6641195B2 (en) Optimization method, optimization device, program, and image processing device
WO2020012523A1 (en) Information processing device, information processing method, and information processing program
WO2020168796A1 (en) Data augmentation method based on high-dimensional spatial sampling
WO2022227217A1 (en) Text classification model training method and apparatus, and device and readable storage medium
JP2020123270A (en) Arithmetic unit
Kumar et al. Imbalanced classification in diabetics using ensembled machine learning
KR20170081887A (en) Method of determining the final answer by convolution of the artificial neural networks
JP2010072896A (en) Sv reduction method for multi-class svm
JP4997524B2 (en) Multivariable decision tree construction system, multivariable decision tree construction method, and program for constructing multivariable decision tree
CN115879540A (en) System and method for continuous joint learning based on deep learning
JP7211430B2 (en) Machine learning device, machine learning method, and program
Patel et al. Machine learning based structure recognition in analog schematics for constraints generation
KR20210057847A (en) Image Recognition Method through Deep Learning of Multidimensional Neural Network and Decision Neural Network And System Thereof
US20200387792A1 (en) Learning device and learning method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15893309

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2017520142

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15893309

Country of ref document: EP

Kind code of ref document: A1