WO2020261509A1 - Machine learning device, machine learning program, and machine learning method - Google Patents

Machine learning device, machine learning program, and machine learning method

Info

Publication number
WO2020261509A1
Authority
WO
WIPO (PCT)
Prior art keywords
output
weight
machine learning
input
vector
Prior art date
Application number
PCT/JP2019/025711
Other languages
French (fr)
Japanese (ja)
Inventor
一紀 中田
Original Assignee
TDK株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TDK株式会社
Priority to PCT/JP2019/025711
Priority to PCT/JP2020/025150
Publication of WO2020261509A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology

Definitions

  • the present invention relates to a machine learning device, a machine learning program, and a machine learning method.
  • In Non-Patent Document 1, a method using the extended Kalman filter is known as a method of updating weights.
  • the neural network that updates the weights using the extended Kalman filter method may cause numerical instability due to quantization error in the matrix calculation for calculating the Kalman gain matrix.
  • the occurrence of such numerical instability can be suppressed by increasing the number of quantization bits.
  • One aspect of the present invention is a machine learning device that performs machine learning of input data of one or more dimensions arranged in a predetermined order by using a recursive neural network having a plurality of nodes connected to each other by weighted edges.
  • the recursive neural network comprises an input layer having one or more input nodes, an intermediate layer having one or more intermediate nodes, and an output layer having one or more output nodes.
  • the input nodes, the intermediate nodes, and the output nodes are distinct nodes among the plurality of nodes, and the weight assigned to each edge connecting the intermediate nodes has a predetermined size.
  • the machine learning device performs output data generation processing and weight update processing each time the input layer receives input data of one or more dimensions in the predetermined order.
  • the output data generation processing performs, in this order: a first process of outputting the input data received by the input layer from the input layer to the intermediate layer; a second process of outputting intermediate data of one or more dimensions, corresponding to the input data input to the intermediate layer by the first process, from the intermediate layer to the output layer; and a third process of generating output data of one or more dimensions corresponding to the intermediate data input to the output layer by the second process.
  • the weight update processing takes, as an estimated weight vector, a vector whose components are estimated values of the weights assigned to the edges connecting the intermediate nodes and the output nodes, and takes, as a predicted output vector, a vector whose components are predicted values of the output data of one or more dimensions.
  • the weight update processing calculates the Kalman gain matrix in the ensemble Kalman filter method based on two or more estimated weight vectors having mutually different components and the predicted output vector calculated for each of the two or more estimated weight vectors, and updates the weights assigned to each edge connecting the intermediate nodes and the output nodes based on the calculated Kalman gain matrix.
  • A diagram showing an example of a graph plotting the temporal change of the output data output from the machine learning device 1 during the period in which the machine learning device 1 is machine-learning the temporal change of the displacement of the second weight in the X-axis direction in the double pendulum shown in FIG. 6.
  • A diagram showing an example of a graph plotting the temporal change of the output data output from the machine learning device 1 in the period after the machine learning device 1 has machine-learned the temporal change of the displacement of the second weight in the X-axis direction in the double pendulum shown in FIG. 6.
  • FIG. 1 is a diagram showing an example of the configuration of the machine learning device 1 according to the embodiment.
  • the machine learning device 1 performs machine learning of P-dimensional input data.
  • P may be any integer as long as it is an integer of 1 or more.
  • the machine learning device 1 performs such machine learning by using a recurrent neural network having a plurality of nodes.
  • the plurality of nodes are connected to each other by weighted edges.
  • the P-dimensional input data are data that correlate with each other. Further, the P-dimensional input data are arranged in a predetermined order. In the following, as an example, a case where the predetermined order is chronological order will be described.
  • the P-dimensional input data is P-dimensional time series data.
  • the P-dimensional time series data is, for example, data acquired from P sensors in chronological order.
  • the P sensors may be P different types of sensors, or some or all of them may be sensors of the same type.
  • the predetermined order may be another order such as a spatially arranged order instead of the time series order.
  • the time indicating the time series order is indicated by the discretized time k.
  • k is an integer.
  • k may be another number such as a real number.
  • the recurrent neural network according to the embodiment has at least an input layer L1, an intermediate layer L2, and an output layer L3.
  • FIG. 2 is a diagram showing an example of the configuration of the recurrent neural network according to the embodiment.
  • the recurrent neural network according to the embodiment will be referred to as an ensemble FORCE learner.
  • each node represents an operation performed on the data flowing through the neural network. In a neural network realized by software, each node therefore corresponds to a function that performs the operation; in a neural network realized by hardware, each node corresponds to an element that performs the operation.
  • an edge connecting between a certain node N1 and another node N2 indicates a data flow from the node N1 to the node N2.
  • the data flowing from node N1 to node N2 is multiplied by the weight assigned to the edge connecting node N1 and node N2. That is, the data input to node N2 from the edge is the data after the weight multiplication performed in passing through the edge. Accordingly, in a neural network realized by software, an edge corresponds to a function that performs this weight multiplication; in a neural network realized by hardware, an edge corresponds to an element that performs this weight multiplication.
  • the input layer L1 has an input node.
  • the input layer L1 may have the same number of input nodes as the number of dimensions of the P-dimensional input data, or a number of input nodes different from the number of dimensions of the P-dimensional input data.
  • in the latter case, the number of input nodes may be less than P or more than P.
  • in that case, a weighted linear sum of the P-dimensional input data is input to these input nodes.
  • a certain input node accepts the input data associated with the input node among the input data.
  • the p-th input node among the P input nodes receives the p-th input data among the P-dimensional time series data.
  • p is any integer of 1 or more and P or less. That is, p is a number (label) that identifies each of the P input nodes with each other, and is also a number (label) that identifies each of the P input data with each other.
  • the input layer L1 outputs each of the P-dimensional input data received by the P input nodes to the intermediate layer L2.
  • the intermediate layer L2 has a plurality of intermediate nodes. Further, the intermediate layer L2 receives each of the P-dimensional input data output by the input layer L1. More specifically, the intermediate layer L2 receives each of the P-dimensional input data output by the input layer L1 by a part or all of the plurality of intermediate nodes. The intermediate layer outputs Q-dimensional intermediate data corresponding to the received P-dimensional input data to the output layer L3. Q may be any number as long as it is an integer of 1 or more. Therefore, the intermediate layer L2 has at least Q intermediate nodes that output each of the Q-dimensional intermediate data to the output layer L3. Here, the qth intermediate node among these Q intermediate nodes outputs the qth intermediate data among the Q-dimensional intermediate data to the output layer L3.
  • q is any integer of 1 or more and Q or less.
  • q is a number (label) that identifies each of the Q intermediate nodes from each other, and is also a number (label) that identifies each of the Q-dimensional intermediate data from each other.
  • when one or more input data are received, a certain intermediate node generates an output value by inputting the sum of the received input data to the first activation function.
  • the first activation function may be any function as long as it is a non-linear function.
  • the intermediate node outputs the generated output value to another node connected to the intermediate node by the edge.
  • the intermediate node is one of the Q intermediate nodes described above
  • the generated output value is output to the output layer L3 as intermediate data.
  • the individual intermediate nodes of the intermediate layer L2 generate such an output value.
  • other processes such as bias addition will not be described.
  • the intermediate layer L2 is, for example, a reservoir in reservoir computing. Accordingly, the weights in the intermediate layer L2 are determined in advance by random numbers, and the weights are not updated in the intermediate layer L2. In other words, the weight assigned to each edge connecting the intermediate nodes is fixed at a predetermined size (that is, a size determined by a random number); a minimal sketch of such a reservoir is given below. The intermediate layer L2 may instead be another kind of intermediate layer in which the weights are not updated within the layer.
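  • The following is a minimal Python/NumPy sketch of such a fixed-weight reservoir. The names (W_in, W_res), the use of tanh as the first activation function, and the spectral-radius scaling are illustrative assumptions, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

P, Q = 3, 100  # input dimension, number of intermediate (reservoir) nodes

# Input-to-reservoir and reservoir-internal weights: determined in advance
# by random numbers and never updated, as described above.
W_in = rng.uniform(-1.0, 1.0, size=(Q, P))
W_res = rng.normal(size=(Q, Q))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))  # common echo-state scaling

def reservoir_step(x, u):
    """One update of the Q-dimensional reservoir state x for a P-dimensional
    input u, using tanh as the (assumed) first activation function."""
    return np.tanh(W_res @ x + W_in @ u)
```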
  • the output layer L3 has R output nodes.
  • R may be any integer as long as it is an integer of 1 or more.
  • the output layer L3 receives Q-dimensional intermediate data from the intermediate layer L2 by these R output nodes.
  • the output layer L3 generates and outputs R-dimensional output data corresponding to the received Q-dimensional intermediate data. That is, the r-th output node among the R output nodes generates the r-th output data among the R-dimensional output data.
  • r is any integer of 1 or more and R or less.
  • r is a number (label) that identifies each of the R output nodes from each other, and is also a number (label) that identifies each of the R-dimensional output data from each other.
  • when one or more intermediate data are received, a certain output node generates an output value by inputting the sum of the received intermediate data to the second activation function.
  • the second activation function will be described later.
  • the output node outputs the output value as output data.
  • the output node of the output layer L3 generates such an output value.
  • other processes such as bias addition and output of the output value will not be described.
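  • As a sketch of the output node computation described above, the readout below maps Q-dimensional intermediate data to R-dimensional output data; using tanh for the second activation function h and the name W_out for the update target weights are illustrative assumptions.

```python
import numpy as np

def output_step(z, W_out, h=np.tanh):
    """Generate R-dimensional output data from Q-dimensional intermediate
    data z. W_out is the (R x Q) matrix of update target weights; each
    output node sums its weighted inputs and applies the second
    activation function h."""
    return h(W_out @ z)
```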
  • the ensemble FORCE learner has an intermediate layer L2 which is a reservoir in this example. Therefore, the ensemble FORCE learner is a kind of reservoir computing in this example.
  • the input node, the intermediate node, and the output node are different nodes among the plurality of nodes of the ensemble FORCE learner, and do not overlap with each other.
  • when some data D1 is output from the input node X11 to the intermediate node X12, the data D1 is multiplied by the weight assigned to the edge connecting the input node X11 and the intermediate node X12. Then, the data D1 after the weight multiplication is input to the intermediate node X12.
  • when some data D2 is output from a certain intermediate node X21 to another intermediate node X22, the data D2 is multiplied by the weight assigned to the edge connecting the intermediate node X21 and the intermediate node X22. Then, the data D2 after the weight multiplication is input to the intermediate node X22.
  • when some data D3 is output from the intermediate node X31 to the output node X32, the data D3 is multiplied by the weight assigned to the edge connecting the intermediate node X31 and the output node X32. Then, the data D3 after the weight multiplication is input to the output node X32.
  • since the weights in the intermediate layer L2 are not updated, in ensemble FORCE learning the weights are updated only for the weights assigned to the edges connecting the intermediate nodes and the output nodes. The weights assigned to the edges connecting the input nodes and the intermediate nodes are likewise not updated. Therefore, in the following, for convenience of explanation, the weights assigned to the edges connecting the intermediate nodes and the output nodes are collectively referred to as the update target weights.
  • the number of weights to be updated is represented by L. L may be any number as long as it is an integer of 2 or more.
  • each " ⁇ ” shown in FIG. 2 indicates a node. That is, each " ⁇ " included in the input layer L1 indicates an input node. Further, each " ⁇ ” included in the intermediate layer L2 indicates an intermediate node. Further, “ ⁇ ” included in the output layer L3 indicates an output node, respectively.
  • the arrows connecting the nodes shown in FIG. 2 are drawn to convey an image of how the nodes are connected by edges in the ensemble FORCE learner; the actual connection pattern of edges between nodes in the ensemble FORCE learner differs from the drawing.
  • the input of the input data to the input layer L1 and the output of the output data from the output layer L3 may be performed by a known method or by a method to be developed in the future; a detailed description thereof is omitted.
  • the machine learning device 1 uses such ensemble FORCE learning to perform machine learning of the above-mentioned P-dimensional input data. More specifically, the machine learning device 1 performs the weight update process and the output data generation process every time the input layer L1 receives P-dimensional input data in chronological order (that is, every time the input data is received in the predetermined order).
  • the weight update process is a process for updating the update target weight.
  • the machine learning device 1 performs a weight update process before performing the output data generation process. That is, each time the input layer L1 receives P-dimensional input data in chronological order, the machine learning device 1 updates the update target weight and then performs the output data generation process.
  • the weight update process is a process for updating the update target weight based on the ensemble Kalman filter method.
  • conventionally, when weights are updated by the ensemble Kalman filter method, it is necessary to prepare the same number of intermediate layers of the neural network as the number of samples (number of particles) in the ensemble Kalman filter method. Therefore, there has conventionally been a problem that the calculation cost increases in this case.
  • the weights in the intermediate layer L2 are not updated. Therefore, in the ensemble FORCE learning, the weight can be updated based on the ensemble Kalman filter method while keeping the number of the intermediate layers L2 at one. As a result, the machine learning device 1 can suppress an increase in calculation cost.
  • the weight update based on the ensemble Kalman filter method has a lower inverse matrix calculation cost than the weight update by another method involving matrix calculation for calculating the Kalman gain matrix.
  • the machine learning device 1 can suppress the occurrence of numerical instability due to the quantization error without increasing the number of quantization bits.
  • the weight update process is a process of calculating the Kalman gain matrix in the ensemble Kalman filter method based on M estimated weight vectors having mutually different components and the predicted output vectors calculated for each of the M estimated weight vectors, and of updating the update target weights based on the calculated Kalman gain matrix.
  • M is the number of samples in the ensemble Kalman filter method. M may be any number as long as it is an integer of 2 or more.
  • the estimated weight vector is a vector having an estimated value for each weight included in the update target weight as a component.
  • in the following, the estimated value is referred to as an estimated weight. Since the number of update target weights is L (that is, the number of estimated weights is L), the estimated weight vector is an L-dimensional vector. The initial value of each estimated weight is determined by a random number.
  • the predicted output vector is a vector having predicted values for each of the R-dimensional output data as components. That is, the predicted output vector is an R-dimensional vector.
  • the predicted output vector at a certain time k is calculated based on the estimated weight vector at the time k.
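  • A minimal sketch of preparing the M estimated weight vectors with random initial values might look as follows; the matrix layout (one L-dimensional column per sample) and the uniform [0, 1] initialization are illustrative assumptions consistent with the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

L_w, M = 100, 20  # number of update target weights, number of samples

# M estimated weight vectors, one column per sample, with components
# initialized randomly in [0, 1].
X_est = rng.uniform(0.0, 1.0, size=(L_w, M))
```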
  • the output data generation process is performed after the update target weights have been updated by the weight update process, and uses the updated update target weights.
  • the output data generation process is a process in which the first process, the second process, and the third process are performed in the order of the first process, the second process, and the third process.
  • the first process is a process of outputting the P-dimensional input data received by the input layer L1 from the input layer L1 to the intermediate layer L2.
  • the second process is a process of outputting Q-dimensional intermediate data corresponding to the P-dimensional input data input to the intermediate layer L2 by the first process from the intermediate layer L2 to the output layer L3.
  • the third process is a process of generating R-dimensional output data corresponding to the Q-dimensional intermediate data input to the output layer L3 by the second process.
  • the output data generation process is the same as the process for generating output data in Reservoir Computing. Therefore, further detailed description of the output data generation process will be omitted.
  • the machine learning device 1 includes an arithmetic unit 11, a memory 12, and a network interface 13.
  • the machine learning device 1 may be configured to include other circuits and other devices.
  • the machine learning device 1 may be configured to include an input device such as a keyboard and a mouse.
  • the machine learning device 1 may be configured to include an output device such as a display.
  • the machine learning device 1 may be configured to include an interface for connecting at least one of the input device and the output device.
  • the arithmetic unit 11 is a processor, for example, an FPGA (Field Programmable Gate Array).
  • the arithmetic unit 11 may be a CPU (Central Processing Unit) instead of the FPGA, may be a combination of the FPGA and the CPU, or may be another processor.
  • in the present embodiment, the arithmetic unit 11 is an FPGA. Therefore, the arithmetic unit 11 realizes the above-mentioned ensemble FORCE learning by the hardware possessed by the FPGA (for example, an integrated circuit or the like), and performs machine learning on the P-dimensional input data.
  • when the arithmetic unit 11 is a CPU, the arithmetic unit 11 may be configured to perform the machine learning by combining the hardware possessed by the CPU and the software executed by the CPU.
  • the arithmetic unit 11 may be configured by near memory, memory logic, or the like, as will be described later. In other words, the arithmetic unit 11 may be composed of hardware including at least one of near memory and memory logic.
  • the memory 12 stores, for example, various information used by the arithmetic unit 11.
  • the memory 12 includes, for example, SSD (Solid State Drive), HDD (Hard Disk Drive), EEPROM (Electrically Erasable Programmable Read-Only Memory), ROM (Read-Only Memory), RAM (Random Access Memory), and the like.
  • the memory 12 may be an external storage device connected by a digital input / output port such as USB, instead of the one built in the arithmetic unit 11.
  • the network interface 13 is an interface that connects to an external device such as a sensor via a network.
  • the weight update process performed by the machine learning device 1 will be described.
  • the weight update process described below is a process based on the ensemble Kalman filter method.
  • sequential calculation is performed according to the time series order indicated by the discretized time k. Therefore, the time k that appears as an argument of the function, vector, matrix, etc. described below indicates the time series order in such sequential calculation.
  • the following formulation by the ensemble Kalman filter method is only an example, and may be another formulation.
  • the ensemble FORCE learner can be represented by the nonlinear vector functions shown in the following equations (1) and (2).
  • the vector x in the above equation (1) represents a weight vector.
  • the weight vector is a vector having the update target weights as components. That is, the vector x_{k+1} indicates the weight vector at time k+1, and x_k indicates the weight vector at time k.
  • the vector ω is a weight error vector indicating a modeling error for the weight vector. That is, the vector ω_k indicates the weight error vector at time k.
  • the vector ω_k is obtained by assuming some error distribution as the distribution of the modeling error for the weight vector at time k. As the error distribution, for example, a Gaussian distribution can be adopted.
  • the first term on the right side in equation (1) may be a non-linear function in which the vector x_k and the time k are variables.
  • the vector y in the above equation (2) indicates an output vector.
  • the output vector is a vector having the R-dimensional output data as components. That is, the vector y_k indicates the output vector at time k.
  • the vector ν is an output error vector indicating a modeling error for the output vector. That is, the vector ν_k indicates the output error vector at time k.
  • the vector ν_k is obtained by assuming some error distribution as the distribution of the modeling error for the output vector at time k. As the error distribution, for example, a Gaussian distribution can be adopted.
  • the function h is the above-mentioned second activation function. More specifically, the function h is a two-variable function such as a sigmoid function, a hyperbolic tangent function, a linear function, or ReLU.
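  • For illustration, the candidate second activation functions named above can be written as follows; treating h as a one-argument callable (dropping the explicit time argument of the two-variable form) is a simplification.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def linear(a):
    return a

def relu(a):
    return np.maximum(a, 0.0)

tanh = np.tanh  # hyperbolic tangent
```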
  • M weight vectors are treated as M samples.
  • a model representing the time evolution for each of these M weight vectors should be represented by the above equation (1). Therefore, in the following, the model representing the time evolution for each of the M weight vectors is represented by the M equations shown in the following equation (3).
  • j is a subscript for identifying each of the M formulas. That is, j is an integer of 1 or more and M or less. Therefore, j is also a subscript that identifies each of the M weight vectors and is also a subscript that identifies each of the M weight error vectors.
  • the model representing the time evolution for each of the M output data should be represented by the above equation (2). Therefore, in the following, the model representing the time evolution for each of the M output data is represented by the M equations shown in the following equation (4).
  • j is a subscript for identifying each of the M equations. That is, j is also a subscript that identifies each of the M output vectors, and is also a subscript that identifies each of the M output error vectors.
  • the above equation (3) is re-expressed as the following equation (5), with the first term on the right side of equation (3) taken as the estimated weight vector and the left side of equation (3) taken as the predicted weight vector.
  • the first term on the right side in the above equation (5) indicates an estimated weight vector. Further, the left side in the equation (5) indicates a predicted weight vector.
  • to calculate equation (5), the estimated weight vector associated with time k-1 is required. Therefore, the estimated weight vector needs an initial value vector. For example, a value of 0 or more and 1 or less may be randomly assigned to each component of the vector as the initial value, or another value may be assigned by another method.
  • similarly, equation (4) is re-expressed as the following equation (6), with the first term on the right side of equation (4) taken as the estimated output vector and the left side of equation (4) taken as the predicted output vector.
  • the first term on the right side in the above equation (6) indicates an estimated output vector. That is, in the ensemble Kalman filter method, the estimated output vector is represented by the second activation function having the predicted weight vector and the time as variables. The left side in Eq. (6) indicates the predicted output vector.
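  • A minimal sketch of this prediction step (equations (5) and (6)) is shown below for the case R = 1 (one output node), assuming a random-walk weight model and Gaussian modeling errors; the noise magnitudes and the realization of h as a readout over the intermediate data z are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def predict_step(X_est, z, h=np.tanh, q_std=1e-4, r_std=1e-3):
    """Eq. (5): predicted weight vectors = estimated weight vectors plus
    weight error vectors. Eq. (6): predicted outputs = second activation
    function applied to each predicted weight vector's readout of the
    intermediate data z, plus output error.
    X_est: (L, M) estimated weight vectors; z: (L,) intermediate data."""
    X_pred = X_est + q_std * rng.standard_normal(X_est.shape)               # Eq. (5)
    y_pred = h(X_pred.T @ z) + r_std * rng.standard_normal(X_est.shape[1])  # Eq. (6)
    return X_pred, y_pred
```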
  • the error ensemble vector for the estimated weight vector calculated based on the above equation (5) is expressed as the following equations (7) and (8).
  • the error ensemble vector will be referred to as a weight error ensemble vector.
  • the left side of the above equation (7) shows the weight error ensemble vector.
  • the right side of the equation (7) shows each component of the weight error ensemble vector.
  • the weight error ensemble vector is defined as a row vector. That is, the transpose of the weight error ensemble vector is a column vector.
  • each component of the weight error ensemble vector is calculated by the equation (8). That is, each component of the weight error ensemble vector is the difference between each estimated weight vector and the average of M estimated weight vectors.
  • the error ensemble vector for the estimated output vector calculated based on the above equation (6) is expressed as the following equations (9) and (10).
  • the error ensemble vector will be referred to as an output error ensemble vector.
  • the left side of the above equation (9) shows the output error ensemble vector.
  • the right side of the equation (9) shows each component of the output error ensemble vector.
  • the output error ensemble vector is defined as a row vector. That is, the transpose of the output error ensemble vector is a column vector.
  • each component of the output error ensemble vector is calculated by the equation (10). That is, each component of the output error ensemble vector is the difference between each estimated output vector and the average of M estimated output vectors.
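  • The two error ensembles (equations (7) through (10)) are the deviations of each sample from the ensemble mean; a sketch for R = 1, continuing the assumptions above:

```python
import numpy as np

def error_ensembles(X_pred, y_pred):
    """Eqs. (7)-(8): each column of dX is one predicted weight vector minus
    the mean of all M predicted weight vectors. Eqs. (9)-(10): dy is each
    predicted output minus the mean of all M predicted outputs (R = 1)."""
    dX = X_pred - X_pred.mean(axis=1, keepdims=True)
    dy = y_pred - y_pred.mean()
    return dX, dy
```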
  • the covariance matrix shown in the above equation (11) will be referred to as a first covariance matrix.
  • the first covariance matrix is a matrix of L rows and R columns because the number of weights to be updated is L and the dimension of the output data is R.
  • the second covariance matrix is a matrix of R rows and R columns because the dimension of the output data is R.
  • the Kalman gain matrix is expressed as the following equation (13), based on the first covariance matrix calculated by the above equation (11) and the second covariance matrix calculated by the above equation (12).
  • the left side of the above equation (13) shows the Kalman gain matrix.
  • the first covariance matrix is a matrix of L rows and R columns
  • the second covariance matrix is a matrix of R rows and R columns. Therefore, the Kalman gain matrix is a matrix of L rows and R columns.
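  • A sketch of equations (11) through (13) for R = 1 follows; with one output node the second covariance matrix is a scalar, so the matrix inverse in equation (13) reduces to a division. The 1/(M-1) normalization is a common ensemble Kalman filter convention and an assumption here.

```python
import numpy as np

def kalman_gain(dX, dy, M):
    """Eq. (11): first covariance matrix (L x 1 here). Eq. (12): second
    covariance matrix (a scalar for R = 1). Eq. (13): Kalman gain
    K = P_xy / P_yy, an L-dimensional column vector."""
    P_xy = (dX @ dy.reshape(-1, 1)) / (M - 1)  # Eq. (11)
    P_yy = float(dy @ dy) / (M - 1)            # Eq. (12)
    return P_xy / P_yy                         # Eq. (13)
```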
  • the estimated weight vector can be calculated by correcting the predicted weight vector, as shown in the following equation (14), using the Kalman gain matrix obtained by the above equation (13).
  • the first term in parentheses in the second term on the right side of the above equation (14) indicates the teacher data for the output data.
  • the ensemble Kalman filter method then calculates the updated update target weights based on the following equation (15).
  • the left side of the above equation (15) indicates the update target weight after the above update. That is, the update target weight after update is the average of the estimated weight vectors.
  • the machine learning device 1 calculates the updated update target weights by the above equation (15), and then performs the above-mentioned output data generation process using the calculated update target weights. Then, when the next input data is received by the input layer L1, the machine learning device 1 starts the next weight update process by using the M estimated weight vectors calculated by the above equation (14) as inputs to the above equation (5). In this way, the machine learning device 1 performs the weight update process and the output data generation process each time input data is received by the input layer L1; a sketch of this correction-and-averaging step is given below.
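  • The correction-and-averaging step (equations (14) and (15)) might look as follows for R = 1, continuing the sketches above; d stands for the teacher data.

```python
import numpy as np

def update_step(X_pred, y_pred, K, d):
    """Eq. (14): correct each predicted weight vector with the Kalman gain
    times the difference between teacher data d and predicted output.
    Eq. (15): the updated update target weights are the ensemble average
    of the M estimated weight vectors."""
    X_est = X_pred + K @ (d - y_pred).reshape(1, -1)  # Eq. (14)
    w = X_est.mean(axis=1)                            # Eq. (15)
    return X_est, w
```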
  • when the output layer L3 has one output node (that is, when R = 1), the second covariance matrix is a 1-by-1 matrix, that is, a scalar.
  • in this case, the Kalman gain matrix becomes an L-by-1 matrix, that is, an L-dimensional vector, and the matrix inverse in equation (13) reduces to a scalar division. From this, the machine learning device 1 that performs machine learning by ensemble FORCE learning can significantly reduce the calculation cost for calculating the Kalman gain matrix by setting the number of output nodes to 1.
  • the machine learning device 1 can suppress the occurrence of numerical instability due to the quantization error.
  • in contrast, when the weights of an ordinary neural network are updated by the ensemble Kalman filter method, it is necessary to prepare the same number of intermediate layers as the number of samples. Therefore, using the ensemble Kalman filter method for weight updates in such a neural network is not preferable from the viewpoint of reducing the calculation cost.
  • on the other hand, in a neural network having an intermediate layer in which the weights are not updated within the layer, such as a reservoir (the intermediate layer L2 in this example), as in ensemble FORCE learning, it is sufficient to prepare one intermediate layer. Therefore, ensemble FORCE learning can suppress both the increase in calculation cost and the occurrence of numerical instability due to the quantization error.
  • the ensemble FORCE learner can be said to be a neural network that can achieve both the merits of adopting the reservoir computing and the merits of updating the weights by the ensemble Kalman filter method.
  • FIG. 3 is a diagram showing an example of the flow of the weight update process performed by the machine learning device 1.
  • the machine learning device 1 performs the processing of the flowchart shown in FIG. 3 every time the input layer L1 receives P-dimensional input data in chronological order.
  • in the following, a case will be described in which the machine learning device 1 receives the first P-dimensional input data in the time series order before the processing of step S110 shown in FIG. 3 is performed.
  • the machine learning device 1 specifies the initial values of each of the M estimated weight vectors (step S110).
  • the machine learning device 1 may calculate the initial values by random numbers, may receive them from the user, or may specify them by another method.
  • next, the machine learning device 1 calculates M predicted weight vectors based on the M initial values specified in step S110, the above equation (5), and the M weight error vectors (step S120).
  • the machine learning device 1 may calculate the M weight error vectors by random numbers, may receive them from the user, or may specify them by another method.
  • next, the machine learning device 1 calculates M predicted output vectors based on the M predicted weight vectors calculated in step S120, the above equation (6), and the M output error vectors (step S130).
  • the machine learning device 1 may calculate the M output error vectors by random numbers, may receive them from the user, or may specify them by another method.
  • next, the machine learning device 1 calculates two error ensemble vectors based on the M predicted weight vectors calculated in step S120 and the M predicted output vectors calculated in step S130 (step S140). More specifically, the machine learning device 1 calculates the weight error ensemble vector based on the M predicted weight vectors calculated in step S120 and the above equations (7) and (8). Further, the machine learning device 1 calculates the output error ensemble vector based on the M predicted output vectors calculated in step S130 and the above equations (9) and (10).
  • next, the machine learning device 1 calculates two covariance matrices based on the two error ensemble vectors calculated in step S140 (step S150). More specifically, the machine learning device 1 calculates the first covariance matrix based on the weight error ensemble vector calculated in step S140, the output error ensemble vector calculated in step S140, and the above equation (11). Further, the machine learning device 1 calculates the second covariance matrix based on the output error ensemble vector calculated in step S140 and the above equation (12).
  • next, the machine learning device 1 calculates the Kalman gain matrix based on the first covariance matrix calculated in step S150, the second covariance matrix calculated in step S150, and the above equation (13) (step S160).
  • next, the machine learning device 1 calculates M estimated weight vectors based on the M predicted weight vectors calculated in step S120, the M predicted output vectors calculated in step S130, the teacher data, the Kalman gain matrix calculated in step S160, and the above equation (14) (step S170).
  • then, the machine learning device 1 calculates the updated update target weights based on the M estimated weight vectors calculated in step S170 and the above equation (15) (step S180).
  • the machine learning device 1 waits until the next input data is received by the input layer L1 (step S190).
  • when the next input data is received by the input layer L1 (step S190: YES), the machine learning device 1 transitions to step S120 and calculates M predicted weight vectors based on the M estimated weight vectors calculated in the immediately preceding step S170. A minimal end-to-end sketch of this loop is given below.
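  • Combining the helper functions sketched earlier (predict_step, error_ensembles, kalman_gain, update_step), a minimal end-to-end version of the flow of FIG. 3 (steps S110 through S190) for P = R = 1 might look as follows; all hyperparameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def enkf_force_train(inputs, targets, Q=100, M=20):
    """One reservoir (intermediate layer L2) with fixed random weights;
    only the Q update target weights are updated by the ensemble Kalman
    filter, once per received input, before output generation."""
    W_in = rng.uniform(-1.0, 1.0, size=(Q, 1))
    W_res = rng.normal(size=(Q, Q))
    W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))
    X_est = rng.uniform(0.0, 1.0, size=(Q, M))        # step S110
    z = np.zeros(Q)
    outputs = []
    for u, d in zip(inputs, targets):                 # loop via step S190
        z = np.tanh(W_res @ z + W_in @ np.atleast_1d(u))
        X_pred, y_pred = predict_step(X_est, z)       # steps S120-S130
        dX, dy = error_ensembles(X_pred, y_pred)      # step S140
        K = kalman_gain(dX, dy, M)                    # steps S150-S160
        X_est, w = update_step(X_pred, y_pred, K, d)  # steps S170-S180
        outputs.append(np.tanh(w @ z))                # output data generation
    return np.array(outputs), w
```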
  • as described above, the machine learning device 1 performs the weight update process based on the ensemble Kalman filter method. As a result, the machine learning device 1 can significantly reduce the calculation cost for calculating the Kalman gain matrix and can thereby suppress the occurrence of numerical instability due to the quantization error. Further, the weight update process enables the machine learning device 1 to perform online learning by the ensemble FORCE learning shown in FIG. 2. As a result, the machine learning device 1 can be mounted on an edge device, for example, as a device that performs machine learning by ensemble FORCE learning. When considering mounting ensemble FORCE learning on an edge device or the like, it is important to improve the efficiency of the weight update process.
  • the ensemble FORCE learner can be implemented in an edge device or the like as hardware including at least one of near memory and memory logic.
  • the memory access speed, calculation speed, and the like of the ensemble FORCE learner mounted on the edge device or the like as the hardware differ depending on the design of the data flow in the weight update process based on the ensemble Kalman filter method.
  • when the ensemble FORCE learner is implemented in an edge device or the like as hardware including at least one of near memory and memory logic, it is therefore necessary to consider an efficient data flow.
  • FIG. 4 is a diagram showing an example of the overall configuration of the data flow in the weight update process based on the ensemble Kalman filter method. As shown in FIG. 4, the data flow in the weight update process is roughly divided into three blocks, block B1, block B2, and block B3. Note that each of these three blocks indicates hardware that includes at least one of near memory and memory logic. In FIG. 4, the time series order in the data flow is shown by the time k.
  • Block B1 is a block for calculating M predicted weight vectors.
  • Block B1 includes a block associated with each of the M estimated weight vectors. More specifically, the block B1 includes the block B1-j as a block in which the j-th estimated weight vector among the M estimated weight vectors is input. That is, block B1 includes M blocks of blocks B1-1 to B1-M.
  • to the block B1-j, the j-th estimated weight vector among the M estimated weight vectors and the j-th weight error vector among the M weight error vectors are input. Then, the block B1-j calculates the j-th predicted weight vector among the M predicted weight vectors based on the estimated weight vector, the weight error vector, and the above equation (5). Block B1-j outputs the calculated predicted weight vector to block B2.
  • Block B2 is a block for calculating M estimated weight vectors.
  • Block B2 includes a block associated with each of the M predicted weight vectors. More specifically, the block B2 includes the block B2-j as a block in which the j-th predicted weight vector among the M predicted weight vectors is input. That is, block B2 includes M blocks of blocks B2-1 to B2-M.
  • to the block B2-j, the j-th predicted weight vector among the M predicted weight vectors, the j-th difference vector among the M difference vectors (described later), and the Kalman gain matrix output from block B3 are input.
  • the j-th difference vector among the M difference vectors is calculated by the following equation (16).
  • the left side of the above equation (16) indicates the j-th difference vector among the M difference vectors. That is, the j-th difference vector among the M difference vectors is a vector in which the inside of the parentheses of the second term on the right side of the equation (14) is newly redefined as one vector.
  • Block B2-j calculates the j-th estimated weight vector among the M estimated weight vectors based on the j-th predicted weight vector among the M predicted weight vectors, the j-th difference vector among the M difference vectors, the Kalman gain matrix, and the above equation (14). Block B2-j outputs the calculated estimated weight vector. As a result, the machine learning device 1 can calculate the updated update target weights in another block (not shown in FIG. 4) based on the M estimated weight vectors calculated in block B2 and the above equation (15).
  • Block B3 is a block for calculating the Kalman gain matrix. The M predicted weight vectors output from block B1 are input to block B3. Then, block B3 calculates the Kalman gain matrix based on the M predicted weight vectors, and outputs the calculated Kalman gain matrix to block B2.
  • FIG. 5 is a diagram showing the simplest concrete example of the data flow inside the block B3.
  • the data flow shown in FIG. 5 is a data flow that holds regardless of which non-linear function is adopted as the second activation function in the ensemble FORCE learning.
  • the data flow shown in FIG. 5 is roughly divided into five blocks, blocks B31 to B35. It should be noted that each of the five blocks indicates hardware including at least one of near memory and memory logic.
  • the time series order in the data flow is shown by the time k.
  • Block B31 is a block for calculating the first term on the right side of the above equation (8) and the first term on the right side of the above equation (10).
  • the first term on the right side of the above equation (8) is the average of the M predicted weight vectors.
  • the first term on the right side of the above equation (10) is the average of the M predicted output vectors. That is, the M predicted weight vectors and the M predicted output vectors are input to the block B31.
  • the block B31 calculates the average of M prediction weight vectors and the average of M prediction output vectors.
  • the block B31 outputs the calculated average of the M predicted weight vectors and the calculated average of the M predicted output vectors to the block B32. At this time, the block B31 further outputs the input M prediction weight vectors and the input M prediction output vectors to the block B32.
  • Block B32 is a block for calculating each component of the weight error ensemble vector and each component of the output error ensemble vector. That is, the average of the M predicted weight vectors output from the block B31 and the average of the M predicted output vectors output from the block B31 are input to the block B32. Further, in the block B32, M predictive weight vectors output from the block B31 and M predictive output vectors output from the block B31 are input. Then, the block B32 calculates each component of the weight error ensemble vector based on the M predicted weight vectors and the average of the M predicted weight vectors. Further, the block B32 calculates each component of the output error ensemble vector based on the average of the M predicted output vectors and the M predicted output vectors. The block B32 outputs each component of the calculated weight error ensemble vector and each component of the calculated output error ensemble vector to the block B33.
  • Block B33 is a block that generates the weight error ensemble vector and the output error ensemble vector. That is, each component of the weight error ensemble vector output from block B32 and each component of the output error ensemble vector output from block B32 are input to block B33. Then, block B33 generates the weight error ensemble vector based on each component of the weight error ensemble vector output from block B32. Further, block B33 generates the output error ensemble vector based on each component of the output error ensemble vector. Block B33 outputs the generated weight error ensemble vector and the generated output error ensemble vector to block B34.
  • Block B34 is a block for calculating the first covariance matrix and the second covariance matrix.
  • the block B34 is a block that performs the calculation of the above equations (11) and (12). That is, the weight error ensemble vector output from block B33 and the output error ensemble vector output from block B33 are input to block B34. Then, block B34 calculates the first covariance matrix based on the weight error ensemble vector and the output error ensemble vector. Further, block B34 calculates the second covariance matrix based on the output error ensemble vector. Block B34 outputs the calculated first covariance matrix and the calculated second covariance matrix to block B35.
  • Block B35 is a block for calculating the Kalman gain matrix.
  • the block B35 is a block that performs the calculation of the above equation (13). That is, the first covariance matrix output from the block B34 and the second covariance matrix output from the block B34 are input to the block B35. Then, the block B35 calculates the Kalman gain matrix based on the first covariance matrix and the second covariance matrix. Block B35 outputs the calculated Kalman gain matrix.
  • the machine learning device 1 can implement ensemble FORCE learning on an edge device or the like as hardware including at least one of near memory and memory logic. As a result, the machine learning device 1 can increase the speed of memory access, the calculation speed, and the like without using a special function as the second activation function.
  • FIG. 6 is a diagram showing an example of a double pendulum composed of a first weight having a mass m1 connected by a rod having a length l1 from the origin, and a second weight having a mass m2 connected to the first weight by a rod having a length l2.
  • the temporal changes in the displacements of the first weight and the second weight in the X-axis direction and the Y-axis direction in the double pendulum shown in FIG. 6 are deterministically described by the equation of motion.
  • the direction in which gravity acts is the direction indicated by the arrow g.
  • the equation of motion for the double pendulum shown in FIG. 6 is written down for each of the first weight and the second weight.
  • the force in the equation of motion written for each of the first weight and the second weight is expressed by a function of four parameters: the angle θ1 between the Y-axis and the rod l1 shown in FIG. 6, the angle θ2 between the Y-axis and the rod l2, the angular velocity that is the change of the angle θ1 per unit time, and the angular velocity that is the change of the angle θ2 per unit time. A simulation sketch based on these equations of motion is given below.
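  • For reference, training data of the kind used in FIGS. 7 through 12 can be generated by numerically integrating the standard closed-form double-pendulum accelerations, with angles measured from the direction of gravity. The masses, lengths, time step, and explicit Euler integrator below are illustrative assumptions (the patent does not specify them).

```python
import numpy as np

def double_pendulum_step(state, dt=1e-3, m1=1.0, m2=1.0, l1=1.0, l2=1.0, g=9.8):
    """One explicit-Euler step of the double-pendulum equations of motion.
    state = (th1, th2, w1, w2): the two angles and their angular
    velocities -- the four parameters named in the text."""
    th1, th2, w1, w2 = state
    d = th1 - th2
    den = 2.0 * m1 + m2 - m2 * np.cos(2.0 * d)
    a1 = (-g * (2.0 * m1 + m2) * np.sin(th1)
          - m2 * g * np.sin(th1 - 2.0 * th2)
          - 2.0 * np.sin(d) * m2 * (w2 ** 2 * l2 + w1 ** 2 * l1 * np.cos(d))) / (l1 * den)
    a2 = (2.0 * np.sin(d) * (w1 ** 2 * l1 * (m1 + m2)
          + g * (m1 + m2) * np.cos(th1)
          + w2 ** 2 * l2 * m2 * np.cos(d))) / (l2 * den)
    return state + dt * np.array([w1, w2, a1, a2])

def second_weight_x(state, l1=1.0, l2=1.0):
    """X-axis displacement of the second weight: the quantity whose
    temporal change the machine learning device 1 learns."""
    return l1 * np.sin(state[0]) + l2 * np.sin(state[1])
```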
  • FIG. 7 is a diagram showing an example of a graph plotting the temporal change of the output data output from the machine learning device 1 during the period in which the machine learning device 1 is machine-learning the temporal change of the displacement of the second weight in the X-axis direction in the double pendulum shown in FIG. 6.
  • the vertical axis of the graph shown in FIG. 7 shows the displacement of the second weight in the X-axis direction.
  • the horizontal axis of the graph shows the elapsed time.
  • the said period is shown as a period of elapsed time 0 to elapsed time 400.
  • the plot PLT1 in the graph shown in FIG. 7 is a plot of teacher data. Further, the plot PLT2 in the graph is a plot of output data. As shown in FIG. 7, the degree of agreement between the output data output from the machine learning device 1 during online learning and the teacher data is not so high. This is because the machine learning device 1 is learning online.
  • FIG. 8 is a diagram showing an example of a graph plotting the temporal change of the output data output from the machine learning device 1 in the period after the machine learning device 1 has machine-learned the temporal change of the displacement of the second weight in the X-axis direction in the double pendulum shown in FIG. 6.
  • the vertical axis of the graph shown in FIG. 8 shows the displacement of the second weight in the X-axis direction.
  • the horizontal axis of the graph shows the elapsed time.
  • the said period is shown as a period of elapsed time 400 to elapsed time 800.
  • the plot PLT1 in the graph shown in FIG. 8 is a plot of teacher data. Further, the plot PLT3 in the graph is a plot of output data. As shown in FIG. 8, the degree of agreement between the output data output from the machine learning device 1 after the online learning and the teacher data is higher than that before the online learning.
  • the accuracy of the results of online learning performed by the machine learning device 1 varies depending on the number of intermediate nodes and the number of samples.
  • FIG. 9 is a diagram showing another example of a graph plotting the temporal change of the output data output from the machine learning device 1 during the period in which the machine learning device 1 is machine-learning the temporal change of the displacement of the second weight in the X-axis direction in the double pendulum shown in FIG. 6.
  • the vertical axis of the graph shown in FIG. 9 shows the displacement of the second weight in the X-axis direction.
  • the horizontal axis of the graph shows the elapsed time.
  • the said period is shown as a period of elapsed time 0 to elapsed time 400.
  • the plot PLT1 in the graph shown in FIG. 9 is a plot of teacher data. Further, the plot PLT4 in the graph is a plot of output data. As shown in FIG. 9, the degree of agreement between the output data output from the machine learning device 1 during online learning and the teacher data is not so high. This is because the machine learning device 1 is learning online.
  • FIG. 10 is a diagram showing another example of a graph plotting the temporal change of the output data output from the machine learning device 1 in the period after the machine learning device 1 has machine-learned the temporal change of the displacement of the second weight in the X-axis direction in the double pendulum shown in FIG. 6.
  • the vertical axis of the graph shown in FIG. 10 shows the displacement of the second weight in the X-axis direction.
  • the horizontal axis of the graph shows the elapsed time.
  • the said period is shown as a period of elapsed time 400 to elapsed time 800.
  • the plot PLT1 in the graph shown in FIG. 10 is a plot of teacher data, and the plot PLT5 in the graph is a plot of output data. As shown in FIG. 10, the degree of agreement between the output data output from the machine learning device 1 after the online learning and the teacher data is higher than before the online learning. Further, the degree of agreement in the example shown in FIG. 10 does not change much compared with the corresponding example shown in FIG. 8. This means that the accuracy of the online learning performed by the machine learning device 1 remains high even though the number of intermediate nodes in the example shown in FIG. 10 is half the number of intermediate nodes in the example shown in FIG. 7.
  • the machine learning device 1 can improve the accuracy of online learning while reducing the number of intermediate nodes by the ensemble FORCE learning and the weight update process by the ensemble Kalman filter method. As a result, the machine learning device 1 can achieve both a reduction in manufacturing cost and an improvement in machine learning accuracy.
  • FIGS. 11 and 12 show an example of the results when graphs similar to those shown in FIGS. 7 and 8 are drawn for the machine learning device 1 with the number of intermediate nodes set to 250 and the number of samples in the ensemble Kalman filter method set to 20.
  • FIG. 11 is a diagram showing still another example of a graph plotting the temporal change of the output data output from the machine learning device 1 during the period in which the machine learning device 1 is machine-learning the temporal change of the displacement of the second weight in the X-axis direction in the double pendulum shown in FIG. 6.
  • the vertical axis of the graph shown in FIG. 11 shows the displacement of the second weight in the X-axis direction.
  • the horizontal axis of the graph shows the elapsed time.
  • the said period is shown as a period of elapsed time 0 to elapsed time 400.
  • the plot PLT1 in the graph shown in FIG. 11 is a plot of teacher data. Further, the plot PLT6 in the graph is a plot of output data. As shown in FIG. 11, the degree of agreement between the output data output from the machine learning device 1 during online learning and the teacher data is not so high. This is because the machine learning device 1 is learning online.
  • FIG. 12 is a diagram showing still another example of a graph plotting the temporal change of the output data output from the machine learning device 1 in the period after the machine learning device 1 has machine-learned the temporal change of the displacement of the second weight in the X-axis direction in the double pendulum shown in FIG. 6.
  • the vertical axis of the graph shown in FIG. 12 shows the displacement of the second weight in the X-axis direction.
  • the horizontal axis of the graph shows the elapsed time.
  • the said period is shown as a period of elapsed time 400 to elapsed time 800.
  • the plot PLT1 in the graph shown in FIG. 12 is a plot of teacher data, and the plot PLT7 in the graph is a plot of output data. As shown in FIG. 12, the degree of agreement between the output data output from the machine learning device 1 after the online learning and the teacher data is higher than before the online learning. Further, the degree of agreement in the example shown in FIG. 12 does not change much compared with the corresponding example shown in FIG. 10. This means that the accuracy of the online learning performed by the machine learning device 1 remains high even though the number of samples in the example shown in FIG. 12 is one-fifth of the number of samples in the example shown in FIG. 10.
  • the machine learning device 1 can improve the accuracy of online learning while reducing the number of samples by the ensemble FORCE learning and the weight update process by the ensemble Kalman filter method. As a result, the machine learning device 1 can achieve both a reduction in manufacturing cost and an improvement in machine learning accuracy.
  • as described above, the machine learning device performs machine learning of input data of one or more dimensions arranged in a predetermined order by using a recursive neural network having a plurality of nodes connected to each other by weighted edges.
  • the recursive neural network has an input layer having one or more input nodes, an intermediate layer having one or more intermediate nodes, and an output layer having one or more output nodes.
  • the input nodes, the intermediate nodes, and the output nodes are distinct nodes among the plurality of nodes, and the weight assigned to each edge connecting the intermediate nodes has a predetermined size.
  • the machine learning device performs the output data generation process and the weight update process each time the input layer receives input data of one or more dimensions in the predetermined order.
  • the output data generation process performs, in this order: a first process of outputting the input data received by the input layer from the input layer to the intermediate layer; a second process of outputting intermediate data of one or more dimensions corresponding to the input data input to the intermediate layer by the first process from the intermediate layer to the output layer; and a third process of generating output data of one or more dimensions corresponding to the intermediate data input to the output layer by the second process.
  • the weight update process takes, as an estimated weight vector, a vector whose components are estimated values of the weights assigned to the edges connecting the intermediate nodes and the output nodes, and, as a predicted output vector, a vector whose components are predicted values of the output data of one or more dimensions; it calculates the Kalman gain matrix based on two or more estimated weight vectors having mutually different components and the predicted output vector calculated for each of the two or more estimated weight vectors, and updates the weights assigned to each edge connecting the intermediate nodes and the output nodes based on the calculated Kalman gain matrix.
  • thereby, the machine learning device can suppress the occurrence of numerical instability due to the quantization error in a recurrent neural network accompanied by the matrix calculation for obtaining the Kalman gain matrix, without increasing the number of quantization bits.
• In the weight update process, the machine learning device may calculate two or more predicted weight vectors based on the two or more estimated weight vectors, calculate a predicted weight error ensemble vector based on the calculated two or more predicted weight vectors and a predicted output error ensemble vector based on the two or more predicted output vectors, and calculate the Kalman gain matrix based on the calculated predicted weight error ensemble vector and the calculated predicted output error ensemble vector.
• A configuration may be used in which the output layer has one output node, the predicted output vector is a vector having a predicted value for the one-dimensional output data as its component, and the Kalman gain matrix is a matrix having a plurality of rows and one column.
• A configuration may be used in which the intermediate layer is a reservoir.
• A configuration may be used in which the machine learning device performs at least the weight update process by hardware including at least one of a near-memory and a memory-logic device.
• A program for realizing the function of an arbitrary component of the device described above (for example, the machine learning device 1) may be recorded on a computer-readable recording medium, and the program may be read into a computer system and executed.
• The term "computer system" as used herein includes an OS (Operating System) and hardware such as peripheral devices.
  • the "computer-readable recording medium” refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, a CD (Compact Disk) -ROM, or a storage device such as a hard disk built in a computer system. ..
  • a "computer-readable recording medium” is a volatile memory (RAM) inside a computer system that serves as a server or client when a program is transmitted via a network such as the Internet or a communication line such as a telephone line.
  • RAM volatile memory
• The above program may be transmitted from a computer system in which the program is stored in a storage device or the like to another computer system via a transmission medium, or by transmission waves in the transmission medium.
  • the "transmission medium" for transmitting a program means a medium having a function of transmitting information, such as a network (communication network) such as the Internet or a communication line (communication line) such as a telephone line.
  • the above program may be for realizing a part of the above-mentioned functions.
  • the above program may be a so-called difference file (difference program) that can realize the above-mentioned functions in combination with a program already recorded in the computer system.
• 1 ... Machine learning device, 11 ... Arithmetic unit, 12 ... Memory, 13 ... Network interface, L1 ... Input layer, L2 ... Intermediate layer, L3 ... Output layer

Abstract

A machine learning device wherein a recurrent neural network has an input layer, an intermediate layer, and an output layer, weights assigned to the respective edges linking intermediate nodes are fixed at a predetermined size, and the machine learning device carries out an output data generation process and a weight updating process every time the input layer receives input data of one or more dimensions in a predetermined sequence. The weight updating process is a process for calculating a Kalman gain matrix in an ensemble Kalman filter method on the basis of two or more estimation weight vectors having mutually different components and a predicted output vector calculated for each of the two or more estimation weight vectors, and updating the weights assigned to the respective edges linking the intermediate nodes and an output node, on the basis of the calculated Kalman gain matrix.

Description

Machine learning device, machine learning program, and machine learning method
The present invention relates to a machine learning device, a machine learning program, and a machine learning method.
In a neural network that performs machine learning of time series data, a method using the extended Kalman filter method is known as a method of updating weights (see Non-Patent Document 1).
On the other hand, in order to implement neural networks on devices such as edge devices, research and development are also being conducted on technologies for realizing neural networks by hardware.
However, a neural network that updates weights using the extended Kalman filter method may suffer from numerical instability caused by quantization error in the matrix calculation for calculating the Kalman gain matrix. The occurrence of such numerical instability can be suppressed by increasing the number of quantization bits. However, for a hardware implementation of a neural network, a small number of quantization bits is desirable. For these reasons, in a conventional neural network involving matrix calculation for calculating the Kalman gain matrix, it was sometimes difficult to suppress the occurrence of numerical instability caused by quantization error without increasing the number of quantization bits.
One aspect of the present invention is a machine learning device that performs machine learning of input data of one or more dimensions arranged in a predetermined order, using a recurrent neural network having a plurality of nodes connected to each other by weighted edges. The recurrent neural network has an input layer having one or more input nodes, an intermediate layer having one or more intermediate nodes, and an output layer having one or more output nodes. The input node, the intermediate node, and the output node are different nodes among the plurality of nodes, and the weight assigned to each edge connecting the intermediate nodes is fixed at a predetermined size. The machine learning device performs an output data generation process and a weight update process each time the input layer receives the input data of one or more dimensions in the predetermined order. The output data generation process performs, in order, a first process of outputting the input data received by the input layer from the input layer to the intermediate layer, a second process of outputting intermediate data of one or more dimensions corresponding to the input data input to the intermediate layer by the first process from the intermediate layer to the output layer, and a third process of generating output data of one or more dimensions corresponding to the intermediate data input to the output layer by the second process. The weight update process is a process of calculating a Kalman gain matrix in an ensemble Kalman filter method based on two or more estimated weight vectors having mutually different components and a predicted output vector calculated for each of the two or more estimated weight vectors, where an estimated weight vector is a vector whose components are estimated values of the weights assigned to the edges connecting the intermediate nodes and the output nodes, and a predicted output vector is a vector whose components are predicted values of the output data of one or more dimensions, and of updating the weights assigned to the edges connecting the intermediate nodes and the output nodes based on the calculated Kalman gain matrix.
According to the present invention, in a recurrent neural network involving matrix calculation for calculating a Kalman gain matrix, it is possible to suppress the occurrence of numerical instability caused by quantization error without increasing the number of quantization bits.
FIG. 1 is a diagram showing an example of the configuration of the machine learning device 1 according to the embodiment.
FIG. 2 is a diagram showing an example of the configuration of the recurrent neural network according to the embodiment.
FIG. 3 is a diagram showing an example of the flow of the weight update process performed by the machine learning device 1.
FIG. 4 is a diagram showing an example of the overall configuration of the data flow in the weight update process based on the ensemble Kalman filter method.
FIG. 5 is a diagram showing the simplest concrete example of the data flow inside block B3.
FIG. 6 is a diagram showing an example of a double pendulum composed of a first weight of mass m1 connected to the origin by a rod of length l1 and a second weight of mass m2 connected to the first weight by a rod of length l2.
FIG. 7 is a diagram showing an example of a graph plotting the temporal change of the output data output from the machine learning device 1 during the period in which the machine learning device 1 is machine-learning the temporal change of the displacement in the X-axis direction of the second weight of the double pendulum shown in FIG. 6.
FIG. 8 is a diagram showing an example of a graph plotting the temporal change of the output data output from the machine learning device 1 in the period after the machine learning device 1 has machine-learned the temporal change of the displacement in the X-axis direction of the second weight of the double pendulum shown in FIG. 6.
FIG. 9 is a diagram showing another example of a graph plotting the temporal change of the output data output from the machine learning device 1 during the period in which the machine learning device 1 is machine-learning the temporal change of the displacement in the X-axis direction of the second weight of the double pendulum shown in FIG. 6.
FIG. 10 is a diagram showing another example of a graph plotting the temporal change of the output data output from the machine learning device 1 in the period after the machine learning device 1 has machine-learned the temporal change of the displacement in the X-axis direction of the second weight of the double pendulum shown in FIG. 6.
FIG. 11 is a diagram showing still another example of a graph plotting the temporal change of the output data output from the machine learning device 1 during the period in which the machine learning device 1 is machine-learning the temporal change of the displacement in the X-axis direction of the second weight of the double pendulum shown in FIG. 6.
FIG. 12 is a diagram showing still another example of a graph plotting the temporal change of the output data output from the machine learning device 1 in the period after the machine learning device 1 has machine-learned the temporal change of the displacement in the X-axis direction of the second weight of the double pendulum shown in FIG. 6.
<Embodiment>
Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
<Configuration of machine learning device>
First, the configuration of the machine learning device 1 according to the embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram showing an example of the configuration of the machine learning device 1 according to the embodiment.
The machine learning device 1 performs machine learning of P-dimensional input data. P may be any integer equal to or greater than 1. The machine learning device 1 performs such machine learning using a recurrent neural network having a plurality of nodes. In this recurrent neural network, the plurality of nodes are connected to each other by edges to which weights are assigned.
Here, the P-dimensional input data are data that are correlated with each other and are arranged in a predetermined order. In the following, as an example, the case where the predetermined order is chronological order will be described. In this case, the P-dimensional input data are P-dimensional time series data, for example, data acquired in chronological order from P sensors. The P sensors may be P different types of sensors, or some or all of them may be of the same type. The predetermined order may also be another order, such as a spatially arranged order, instead of chronological order.
In the following, for convenience of explanation, the time indicating the chronological order is denoted by a discretized time k, where k is an integer. When the time indicating the chronological order is a continuous variable, k may be another kind of number, such as a real number.
Here, as shown in FIG. 2, the recurrent neural network according to the embodiment has at least an input layer L1, an intermediate layer L2, and an output layer L3. FIG. 2 is a diagram showing an example of the configuration of the recurrent neural network according to the embodiment. In the following, for convenience of explanation, the recurrent neural network according to the embodiment is referred to as an ensemble FORCE learner.
Note that, in a neural network, each node represents an operation performed on the data flowing through the network. Therefore, in a neural network realized by software, each node corresponds to a function that performs the operation, and in a neural network realized by hardware, each node corresponds to an element that performs the operation.
Further, in a neural network, an edge connecting a node N1 to another node N2 indicates the flow of data from the node N1 to the node N2. The data flowing from the node N1 to the node N2 is multiplied by the weight assigned to the edge connecting the node N1 and the node N2. That is, the data after being multiplied by the weight while passing through the edge is input from the edge to the node N2. Therefore, in a neural network realized by software, the edge corresponds to a function that performs such weight multiplication, and in a neural network realized by hardware, the edge corresponds to an element that performs such weight multiplication.
The input layer L1 has input nodes. Here, the input layer L1 may have the same number of input nodes as the number of dimensions of the P-dimensional input data, or a different number of input nodes. When the input layer L1 has a number of input nodes different from the number of dimensions of the P-dimensional input data, the number of input nodes may be smaller or larger than P. In that case, for example, weighted linear sums of the P-dimensional input data are input to these input nodes. In the following, as an example, the case where the input layer L1 has P input nodes will be described. In this case, each input node receives the input data associated with that input node. In other words, the p-th input node among the P input nodes receives the p-th input data of the P-dimensional time series data. Here, p is an integer between 1 and P, inclusive. That is, p is a number (label) that identifies each of the P input nodes and also identifies each of the P input data. The input layer L1 outputs each of the P-dimensional input data received by the P input nodes to the intermediate layer L2.
The intermediate layer L2 has a plurality of intermediate nodes, and receives each of the P-dimensional input data output by the input layer L1 through some or all of the plurality of intermediate nodes. The intermediate layer outputs Q-dimensional intermediate data corresponding to the received P-dimensional input data to the output layer L3. Q may be any integer equal to or greater than 1. Therefore, the intermediate layer L2 has at least Q intermediate nodes that output the Q-dimensional intermediate data to the output layer L3. The q-th intermediate node among these Q intermediate nodes outputs the q-th intermediate data of the Q-dimensional intermediate data to the output layer L3. Here, q is an integer between 1 and Q, inclusive, and is a number (label) that identifies each of the Q intermediate nodes and also identifies each of the Q-dimensional intermediate data.
Here, when an intermediate node receives one or more input data, it generates the output value obtained by inputting the sum of the received input data to a first activation function. The first activation function may be any function as long as it is a nonlinear function. The intermediate node then outputs the generated output value to the other nodes connected to it by edges. When the intermediate node is one of the Q intermediate nodes described above, the generated output value is output to the output layer L3 as intermediate data. Each intermediate node of the intermediate layer L2 generates an output value in this way. Of the processes performed by an intermediate node, descriptions of other processes such as the addition of a bias are omitted.
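As a minimal illustration of the node and edge semantics described above (the choice of tanh as the nonlinear first activation function is an assumption for this sketch, not something the document specifies):

```python
import numpy as np

def node_output(inputs, weights, activation=np.tanh):
    """Sketch of one intermediate node: each incoming edge multiplies its
    data by the edge weight, and the node applies a nonlinear activation
    to the sum of the weighted inputs (bias omitted, as in the text)."""
    weighted = [w * x for w, x in zip(weights, inputs)]  # edge multiplications
    return activation(sum(weighted))                     # node computation

# Example: a node with three incoming edges.
print(node_output([0.5, -1.0, 2.0], [0.1, 0.4, -0.3]))
```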
The intermediate layer L2 is, for example, a reservoir in reservoir computing. The weights inside the intermediate layer L2 are therefore determined in advance by random numbers, and are not updated. In other words, the weight assigned to each edge connecting the intermediate nodes is fixed at a predetermined size (that is, a size determined by a random number). Instead of a reservoir, the intermediate layer L2 may be another type of intermediate layer in which the weights within the layer are not updated.
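A minimal sketch of such a reservoir, assuming a tanh nonlinearity and a simple state update; the dimensions and the spectral-radius-style scaling are illustrative assumptions, not values from the document:

```python
import numpy as np

rng = np.random.default_rng(0)
P, Q = 3, 100  # input dimension and number of intermediate (reservoir) nodes

# Fixed random weights: determined once by random numbers and never updated.
W_in  = rng.uniform(-1.0, 1.0, size=(Q, P))   # input -> intermediate edges
W_res = rng.uniform(-1.0, 1.0, size=(Q, Q))   # intermediate -> intermediate edges
W_res *= 0.9 / max(abs(np.linalg.eigvals(W_res)))  # illustrative stability scaling

def reservoir_step(r, u):
    """One update of the reservoir state r given one P-dimensional input u."""
    return np.tanh(W_res @ r + W_in @ u)

r = np.zeros(Q)
u_k = rng.standard_normal(P)   # one P-dimensional input sample
r = reservoir_step(r, u_k)     # Q-dimensional intermediate data
```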
The output layer L3 has R output nodes, where R may be any integer equal to or greater than 1. Through these R output nodes, the output layer L3 receives the Q-dimensional intermediate data from the intermediate layer L2, and generates and outputs R-dimensional output data corresponding to the received Q-dimensional intermediate data. That is, the r-th output node among the R output nodes generates the r-th output data of the R-dimensional output data. Here, r is an integer between 1 and R, inclusive, and is a number (label) that identifies each of the R output nodes and also identifies each of the R-dimensional output data.
Here, when an output node receives one or more intermediate data, it generates the output value obtained by inputting the sum of the received intermediate data to a second activation function, which will be described later. The output node then outputs this output value as output data. Each output node of the output layer L3 generates an output value in this way. Of the processes performed by an output node, descriptions of other processes such as the addition of a bias and the output of the output value are omitted.
As described above, the ensemble FORCE learner has, in this example, an intermediate layer L2 that is a reservoir. The ensemble FORCE learner is therefore, in this example, a kind of reservoir computing.
Note that the input nodes, the intermediate nodes, and the output nodes are mutually different nodes among the plurality of nodes of the ensemble FORCE learner, and do not overlap.
Here, when data D1 is output from an input node X11 to an intermediate node X12, the data D1 is multiplied by the weight assigned to the edge connecting the input node X11 and the intermediate node X12, and the data D1 after the multiplication is input to the intermediate node X12.
Similarly, when data D2 is output from an intermediate node X21 to another intermediate node X22, the data D2 is multiplied by the weight assigned to the edge connecting the intermediate node X21 and the intermediate node X22, and the data D2 after the multiplication is input to the intermediate node X22.
Likewise, when data D3 is output from an intermediate node X31 to an output node X32, the data D3 is multiplied by the weight assigned to the edge connecting the intermediate node X31 and the output node X32, and the data D3 after the multiplication is input to the output node X32.
Since the weights within the intermediate layer L2 are not updated, in ensemble FORCE learning the weight update is performed on the weights assigned to the edges connecting the intermediate nodes and the output nodes, and not on the weights assigned to the edges connecting the input nodes and the intermediate nodes. In the following, for convenience of explanation, the weights assigned to the edges connecting the intermediate nodes and the output nodes are collectively referred to as the update target weights, and the number of update target weights is denoted by L. L may be any integer equal to or greater than 2.
Each "○" shown in FIG. 2 indicates a node: the "○" in the input layer L1 indicate input nodes, those in the intermediate layer L2 indicate intermediate nodes, and those in the output layer L3 indicate output nodes.
The arrows connecting the nodes shown in FIG. 2 are drawn to give an easily understandable image of how the nodes are connected by edges in the ensemble FORCE learner, and differ from the actual edge connections between the nodes in the ensemble FORCE learner.
The input of the input data to the input layer L1 and the output of the output data from the output layer L3 may be performed by known methods or by methods to be developed in the future, and their descriptions are therefore omitted.
The machine learning device 1 performs machine learning of the aforementioned P-dimensional input data using such ensemble FORCE learning. More specifically, each time the input layer L1 receives P-dimensional input data in chronological order (that is, each time it receives the input data in the predetermined order), the machine learning device 1 performs a weight update process and an output data generation process.
The weight update process is a process of updating the update target weights. The machine learning device 1 performs the weight update process before performing the output data generation process. That is, each time the input layer L1 receives P-dimensional input data in chronological order, the machine learning device 1 updates the update target weights and then performs the output data generation process.
The weight update process updates the update target weights based on the ensemble Kalman filter method. Conventionally, when updating weights based on the ensemble Kalman filter method, it is necessary to prepare as many intermediate layers of the neural network as the number of samples (particles) in the ensemble Kalman filter method, which raises the problem of increased computational cost. In ensemble FORCE learning, on the other hand, the weights in the intermediate layer L2 are not updated, so the weights can be updated based on the ensemble Kalman filter method while keeping the number of intermediate layers L2 at one. As a result, the machine learning device 1 can suppress an increase in computational cost. Furthermore, updating the weights based on the ensemble Kalman filter method requires a lower computational cost for matrix inversion than updating the weights by other methods involving matrix calculation for calculating the Kalman gain matrix. As a result, the machine learning device 1 can suppress the occurrence of numerical instability caused by quantization error without increasing the number of quantization bits.
More specifically, the weight update process is a process of calculating the Kalman gain matrix in the ensemble Kalman filter method based on M estimated weight vectors having mutually different components and the predicted output vectors calculated for each of the M estimated weight vectors, and of updating the update target weights based on the calculated Kalman gain matrix. M is the number of samples in the ensemble Kalman filter method, and may be any integer equal to or greater than 2.
Here, an estimated weight vector is a vector whose components are estimated values of the individual weights included in the update target weights. In the following, for convenience of explanation, these estimated values are referred to as estimated weights. Since the number of update target weights is L (that is, the number of estimated weights is L), an estimated weight vector is an L-dimensional vector. The initial values of the estimated weights are determined randomly by random numbers.
A predicted output vector is a vector whose components are predicted values for each of the R-dimensional output data; that is, a predicted output vector is an R-dimensional vector. The predicted output vector at a time k is calculated based on the estimated weight vector at the time k.
The details of the weight update process will be described later.
The output data generation process is performed after the update target weights have been updated by the weight update process, using the updated update target weights. The output data generation process performs a first process, a second process, and a third process, in that order.
The first process is a process of outputting the P-dimensional input data received by the input layer L1 from the input layer L1 to the intermediate layer L2.
The second process is a process of outputting the Q-dimensional intermediate data corresponding to the P-dimensional input data input to the intermediate layer L2 by the first process, from the intermediate layer L2 to the output layer L3.
The third process is a process of generating the R-dimensional output data corresponding to the Q-dimensional intermediate data input to the output layer L3 by the second process.
The output data generation process is the same as the process of generating output data in reservoir computing, and a more detailed description is therefore omitted.
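A minimal sketch of this output data generation process, continuing the reservoir sketch above; the identity readout used as the second activation function and the fully connected readout (so that L = Q × R) are assumptions for illustration:

```python
# Sketch of the output data generation process (first/second/third process),
# reusing rng, Q, W_in, W_res, reservoir_step, and u_k from the reservoir
# sketch above.
R = 1                                  # number of output nodes (illustrative)
W_out = rng.standard_normal((R, Q))    # update target weights, assuming a
                                       # fully connected readout (L = Q * R)

def generate_output(r, u):
    """First process: the input layer passes input data u to the intermediate
    layer. Second process: the intermediate layer produces the Q-dimensional
    intermediate data. Third process: the output layer produces R-dimensional
    output data from the intermediate data (identity readout assumed)."""
    r_next = reservoir_step(r, u)      # first + second process
    y = W_out @ r_next                 # third process
    return r_next, y

r = np.zeros(Q)
r, y_k = generate_output(r, u_k)
```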
Returning to FIG. 1, the machine learning device 1 includes an arithmetic unit 11, a memory 12, and a network interface 13. In addition to these, the machine learning device 1 may include other circuits and other devices. For example, the machine learning device 1 may include input devices such as a keyboard and a mouse, may include an output device such as a display, and may include an interface for connecting at least one of such input and output devices.
The arithmetic unit 11 is a processor, for example, an FPGA (Field Programmable Gate Array). Instead of an FPGA, the arithmetic unit 11 may be a CPU (Central Processing Unit), a combination of an FPGA and a CPU, or another processor.
In this example, the arithmetic unit 11 is an FPGA. The arithmetic unit 11 therefore realizes the aforementioned ensemble FORCE learning with the hardware of the FPGA (for example, integrated circuits) and performs machine learning on the P-dimensional input data. When the arithmetic unit 11 is a CPU, the arithmetic unit 11 may perform the machine learning by a combination of the hardware of the CPU and software executed by the CPU. Further, as will be described later, the arithmetic unit 11 may be configured with near-memory, memory-logic, or the like; in other words, the arithmetic unit 11 may be configured with hardware including at least one of near-memory and memory-logic.
The memory 12 stores, for example, various information used by the arithmetic unit 11. The memory 12 includes, for example, an SSD (Solid State Drive), an HDD (Hard Disk Drive), an EEPROM (Electrically Erasable Programmable Read-Only Memory), a ROM (Read-Only Memory), a RAM (Random Access Memory), and the like. Instead of being built into the arithmetic unit 11, the memory 12 may be an external storage device connected via a digital input/output port such as USB.
The network interface 13 is an interface that connects to external devices such as sensors via a network.
<Weight update process>
Hereinafter, the weight update process performed by the machine learning device 1 will be described. The weight update process described below is based on the ensemble Kalman filter method. In the weight update process based on the ensemble Kalman filter method, sequential calculations are performed according to the chronological order indicated by the discretized time k. Therefore, the time k appearing as an argument of the functions, vectors, matrices, and so on described below indicates the chronological order in such sequential calculations. Note that the following formulation of the ensemble Kalman filter method is only one example, and other formulations may be used.
Here, the ensemble FORCE learner can be represented by the nonlinear vector functions shown in the following equations (1) and (2).
\[
x_{k+1} = x_k + \eta_k \tag{1}
\]

\[
y_k = h(x_k, k) + \zeta_k \tag{2}
\]
The vector x in the above equation (1) denotes the weight vector, which is the vector having the update target weights as its components. That is, the vector x_{k+1} denotes the weight vector at time k+1, and x_k denotes the weight vector at time k. The vector η is a weight error vector indicating the modeling error of the weight vector; that is, the vector η_k denotes the weight error vector at time k. The vector η_k is obtained by assuming some error distribution, for example a Gaussian distribution, as the distribution of the modeling error of the weight vector at time k. Note that the first term on the right-hand side of equation (1) may be a nonlinear function of the vector x_k and the time k.
The vector y in the above equation (2) denotes the output vector, which is the vector having the R-dimensional output data as its components. That is, the vector y_k denotes the output vector at time k. The vector ζ is an output error vector indicating the modeling error of the output vector; that is, the vector ζ_k denotes the output error vector at time k. The vector ζ_k is obtained by assuming some error distribution, for example a Gaussian distribution, as the distribution of the modeling error of the output vector at time k. The function h is the aforementioned second activation function. More specifically, the function h is a two-variable function such as a sigmoid function, a hyperbolic tangent function, a linear function, or ReLU.
Here, in the weight update based on the ensemble Kalman filter method, M weight vectors are treated as M samples. The model representing the time evolution of each of these M weight vectors should be expressed by the above equation (1). Therefore, in the following, the model representing the time evolution of each of the M weight vectors is expressed by the M equations shown in the following equation (3).
\[
x_{k+1}^{(j)} = x_k^{(j)} + \eta_k^{(j)}, \qquad j = 1, \ldots, M \tag{3}
\]
In the above equation (3), j is a subscript identifying each of the M equations; that is, j is an integer between 1 and M, inclusive. Therefore, j is also the subscript identifying each of the M weight vectors and each of the M weight error vectors.
Since the output data are calculated according to the M weight vectors, there should also be M of them, and the model representing the time evolution of each of the M output data should be expressed by the above equation (2). Therefore, in the following, the model representing the time evolution of each of the M output data is expressed by the M equations shown in the following equation (4).
\[
y_k^{(j)} = h\bigl(x_k^{(j)}, k\bigr) + \zeta_k^{(j)}, \qquad j = 1, \ldots, M \tag{4}
\]
In the above equation (4) as well, j is a subscript identifying each of the M equations; that is, j is also the subscript identifying each of the M output vectors and each of the M output error vectors.
In the ensemble Kalman filter method, the above equation (3) is rewritten as the following equation (5), with the first term on the right-hand side of equation (3) taken as the estimated weight vector and the left-hand side of equation (3) taken as the predicted weight vector.
\[
\hat{x}_{k|k-1}^{(j)} = \hat{x}_{k-1|k-1}^{(j)} + \eta_k^{(j)} \tag{5}
\]
The first term on the right-hand side of the above equation (5) is the estimated weight vector, and the left-hand side of equation (5) is the predicted weight vector. Here, as shown in equation (5), in order to obtain the predicted weight vector associated with time k, the estimated weight vector associated with time k-1 is required. The estimated weight vectors therefore need initial vectors. Each component of these initial vectors may, for example, be randomly assigned a value between 0 and 1 by random numbers, or may be assigned another value by another method.
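For example, a minimal sketch of such a random initialization (numpy assumed; the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng()
L, M = 100, 16  # number of update target weights and number of samples

# Initial estimated weight vectors: each of the M vectors gets components
# drawn uniformly at random from [0, 1]; one column per sample.
X_est = rng.random((L, M))
```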
Further, in the ensemble Kalman filter method, the above equation (4) is rewritten as the following equation (6), with the first term on the right-hand side of equation (4) taken as the estimated output vector and the left-hand side of equation (4) taken as the predicted output vector.
\[
\hat{y}_{k|k-1}^{(j)} = h\bigl(\hat{x}_{k|k-1}^{(j)}, k\bigr) + \zeta_k^{(j)} \tag{6}
\]
Here, the first term on the right-hand side of the above equation (6) is the estimated output vector. That is, in the ensemble Kalman filter method, the estimated output vector is expressed by the second activation function with the predicted weight vector and the time as its variables. The left-hand side of equation (6) is the predicted output vector.
Further, in the ensemble Kalman filter method, the error ensemble vector for the estimated weight vectors calculated based on the above equation (5) is expressed as the following equations (7) and (8). In the following, for convenience of explanation, this error ensemble vector is referred to as the weight error ensemble vector.
\[
E_{x,k} = \bigl[\, \tilde{x}_k^{(1)}, \; \tilde{x}_k^{(2)}, \; \ldots, \; \tilde{x}_k^{(M)} \,\bigr] \tag{7}
\]

\[
\tilde{x}_k^{(j)} = \hat{x}_{k|k-1}^{(j)} - \frac{1}{M} \sum_{i=1}^{M} \hat{x}_{k|k-1}^{(i)} \tag{8}
\]
The left-hand side of the above equation (7) is the weight error ensemble vector, and the right-hand side of equation (7) shows its components. As shown on the right-hand side of equation (7), the weight error ensemble vector is defined as a row vector, so its transpose is a column vector. Each component of the weight error ensemble vector is calculated by equation (8); that is, each component is the difference between each estimated weight vector and the mean of the M estimated weight vectors.
Similarly, in the ensemble Kalman filter method, the error ensemble vector for the estimated output vectors calculated based on the above equation (6) is expressed as the following equations (9) and (10). In the following, for convenience of explanation, this error ensemble vector is referred to as the output error ensemble vector.
\[
E_{y,k} = \bigl[\, \tilde{y}_k^{(1)}, \; \tilde{y}_k^{(2)}, \; \ldots, \; \tilde{y}_k^{(M)} \,\bigr] \tag{9}
\]

\[
\tilde{y}_k^{(j)} = \hat{y}_{k|k-1}^{(j)} - \frac{1}{M} \sum_{i=1}^{M} \hat{y}_{k|k-1}^{(i)} \tag{10}
\]
The left-hand side of the above equation (9) is the output error ensemble vector, and the right-hand side of equation (9) shows its components. As shown on the right-hand side of equation (9), the output error ensemble vector is defined as a row vector, so its transpose is a column vector. Each component of the output error ensemble vector is calculated by equation (10); that is, each component is the difference between each estimated output vector and the mean of the M estimated output vectors.
Then, in the ensemble Kalman filter method, the two covariance matrices used to calculate the Kalman gain matrix are expressed, using the weight error ensemble vector and the output error ensemble vector calculated based on the above equations (7) to (10), as the following equations (11) and (12).
\[
P_{xy,k} = \frac{1}{M-1} \, E_{x,k} \, E_{y,k}^{\top} \tag{11}
\]

\[
P_{yy,k} = \frac{1}{M-1} \, E_{y,k} \, E_{y,k}^{\top} \tag{12}
\]
In the following, for convenience of explanation, the covariance matrix shown in the above equation (11) is referred to as the first covariance matrix. Since the number of update target weights is L and the dimension of the output data is R, the first covariance matrix is an L-by-R matrix.
Similarly, in the following, the covariance matrix shown in the above equation (12) is referred to as the second covariance matrix. Since the dimension of the output data is R, the second covariance matrix is an R-by-R matrix.
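A minimal numpy sketch of equations (7) through (12), assuming the predicted ensembles are held column-wise in arrays X_pred (L × M) and Y_pred (R × M); the 1/(M-1) normalization is a common sampling convention and cancels in the gain of equation (13):

```python
import numpy as np

def error_ensembles_and_covariances(X_pred, Y_pred):
    """X_pred: (L, M) predicted weight vectors, one column per sample.
       Y_pred: (R, M) predicted output vectors, one column per sample."""
    M = X_pred.shape[1]
    # Equations (7)-(10): deviation of each sample from the ensemble mean.
    E_x = X_pred - X_pred.mean(axis=1, keepdims=True)
    E_y = Y_pred - Y_pred.mean(axis=1, keepdims=True)
    # Equations (11)-(12): first (L x R) and second (R x R) covariance matrices.
    P_xy = E_x @ E_y.T / (M - 1)
    P_yy = E_y @ E_y.T / (M - 1)
    return P_xy, P_yy
```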
In the ensemble Kalman filter method, the Kalman gain matrix is expressed, using the first covariance matrix calculated based on the above equation (11) and the second covariance matrix calculated based on equation (12), as the following equation (13).
\[
K_k = P_{xy,k} \, P_{yy,k}^{-1} \tag{13}
\]
The left-hand side of the above equation (13) is the Kalman gain matrix. As described above, the first covariance matrix is an L-by-R matrix and the second covariance matrix is an R-by-R matrix, so the Kalman gain matrix is an L-by-R matrix.
In the ensemble Kalman filter method, the estimated weight vectors can be calculated by correcting the predicted weight vectors based on the Kalman gain matrix of the above equation (13), as shown in the following equation (14).
\[
\hat{x}_{k|k}^{(j)} = \hat{x}_{k|k-1}^{(j)} + K_k \bigl( d_k - \hat{y}_{k|k-1}^{(j)} \bigr) \tag{14}
\]
The first term in the parentheses in the second term on the right-hand side of the above equation (14), d_k, denotes the teacher data for the output data.
Once the estimated weight vectors are calculated based on equation (14), in the ensemble Kalman filter method the updated update target weights are calculated based on the following equation (15).
\[
w_k = \frac{1}{M} \sum_{j=1}^{M} \hat{x}_{k|k}^{(j)} \tag{15}
\]
The left-hand side of the above equation (15) denotes the updated update target weights; that is, the updated update target weights are the mean of the estimated weight vectors.
After calculating the updated update target weights by the above equation (15), the machine learning device 1 performs the aforementioned output data generation process using the calculated updated update target weights. Then, when the next input data is received by the input layer L1, the machine learning device 1 starts the next weight update process by using the M estimated weight vectors calculated based on the above equation (14) as the inputs to the above equation (5). In this way, the machine learning device 1 performs the weight update process and the output data generation process each time input data is received by the input layer L1.
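A minimal numpy sketch of one such per-sample weight update, reusing error_ensembles_and_covariances from the sketch above; the Gaussian perturbations and the readout function passed in are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng()

def weight_update(X_est, readout, d_k, sigma_eta=1e-3, sigma_zeta=1e-3):
    """One weight update per equations (5), (6), and (13)-(15).
       X_est: (L, M) estimated weight vectors from the previous step.
       readout: function mapping one weight vector to an R-dim output
                (the second activation function h, assumed given).
       d_k: (R,) teacher data for the current output."""
    L, M = X_est.shape
    # Eq. (5): predicted weight vectors (Gaussian modeling error assumed).
    X_pred = X_est + sigma_eta * rng.standard_normal((L, M))
    # Eq. (6): predicted output vectors (Gaussian modeling error assumed).
    Y_pred = np.stack([readout(X_pred[:, j]) for j in range(M)], axis=1)
    Y_pred += sigma_zeta * rng.standard_normal(Y_pred.shape)
    # Eqs. (7)-(12), then eq. (13): Kalman gain (L x R).
    P_xy, P_yy = error_ensembles_and_covariances(X_pred, Y_pred)
    K = P_xy @ np.linalg.inv(P_yy)
    # Eq. (14): correct each predicted weight vector using the teacher data.
    X_new = X_pred + K @ (d_k[:, None] - Y_pred)
    # Eq. (15): updated update target weights = ensemble mean.
    w = X_new.mean(axis=1)
    return X_new, w
```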
Here, in ensemble FORCE learning, when the number of output nodes is 1, for example, the second covariance matrix becomes a 1-by-1 matrix, that is, a scalar. As a result, in that case the Kalman gain matrix becomes an L-by-1 matrix, that is, an L-dimensional vector. Therefore, by setting the number of output nodes to 1, the machine learning device 1, which performs machine learning by ensemble FORCE learning, can significantly reduce the computational cost of calculating the Kalman gain matrix. Even when the number of output nodes is 2 or more, the second covariance matrix is at most an R-by-R matrix, so ensemble FORCE learning can reduce the computational cost of calculating the Kalman gain matrix compared with other neural networks that involve inverse matrix calculation (for example, a neural network using the extended Kalman filter method). As a result, the machine learning device 1 can suppress the occurrence of numerical instability caused by quantization error.
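In symbols, a sketch of this single-output specialization, using the notation of the reconstructed equations (11) through (13):

```latex
% When the output layer has one output node (R = 1), the second covariance
% matrix P_{yy,k} is a scalar, so the inverse in equation (13) reduces to a
% scalar division and no matrix inversion is needed:
\[
R = 1:\qquad P_{yy,k} \in \mathbb{R},\qquad
K_k = \frac{P_{xy,k}}{P_{yy,k}} \in \mathbb{R}^{L \times 1}.
\]
```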
As a general matter, updating weights by the ensemble Kalman filter method in a neural network requires preparing as many intermediate layers as the number of samples. For this reason, using the ensemble Kalman filter method for weight updates in neural networks has not been preferable from the viewpoint of reducing computational cost. However, in a neural network that, like ensemble FORCE learning, has an intermediate layer in which the weights within the layer are not updated, such as a reservoir (in this example, the intermediate layer L2), it is sufficient to prepare one intermediate layer. Therefore, ensemble FORCE learning can suppress the occurrence of numerical instability caused by quantization error while suppressing an increase in computational cost. In other words, the ensemble FORCE learner can be said to be a neural network that combines the merits of adopting reservoir computing with the merits of updating weights by the ensemble Kalman filter method.
Here, the flow of the weight update process performed by the machine learning device 1 will be described with reference to FIG. 3. FIG. 3 is a diagram showing an example of the flow of the weight update process performed by the machine learning device 1. The machine learning device 1 performs the processing of the flowchart shown in FIG. 3 each time the input layer L1 receives P-dimensional input data in chronological order. In the following, as an example, the case where the machine learning device 1 has received the first P-dimensional input data in the chronological order before the processing of step S110 shown in FIG. 3 is performed will be described.
 The machine learning device 1 specifies the initial value of each of the M estimated weight vectors (step S110). The machine learning device 1 may be configured to calculate these initial values from random numbers, to receive them from a user, or to specify them by some other method.
 Next, the machine learning device 1 calculates M predicted weight vectors based on the M initial values specified in step S110, the above equation (5), and M weight error vectors (step S120). The machine learning device 1 may be configured to calculate the M weight error vectors from random numbers, to receive them from a user, or to specify them by some other method.
 Next, the machine learning device 1 calculates M predicted output vectors based on the M predicted weight vectors calculated in step S120, the above equation (6), and M output error vectors (step S130). The machine learning device 1 may be configured to calculate the M output error vectors from random numbers, to receive them from a user, or to specify them by some other method.
 Next, the machine learning device 1 calculates two error ensemble vectors based on the M predicted weight vectors calculated in step S120 and the M predicted output vectors calculated in step S130 (step S140). More specifically, the machine learning device 1 calculates the weight error ensemble vector based on the M predicted weight vectors calculated in step S120 and the above equations (7) and (8), and calculates the output error ensemble vector based on the M predicted output vectors calculated in step S130 and the above equations (9) and (10).
 Next, the machine learning device 1 calculates two covariance matrices based on the two error ensemble vectors calculated in step S140 (step S150). More specifically, the machine learning device 1 calculates the first covariance matrix based on the weight error ensemble vector calculated in step S140, the output error ensemble vector calculated in step S140, and the above equation (11), and calculates the second covariance matrix based on the output error ensemble vector calculated in step S140 and the above equation (12).
 Next, the machine learning device 1 calculates the Kalman gain matrix based on the first covariance matrix calculated in step S150, the second covariance matrix calculated in step S150, and the above equation (13) (step S160).
 Next, the machine learning device 1 calculates the M estimated weight vectors based on the M predicted weight vectors calculated in step S120, the M predicted output vectors calculated in step S130, the teacher data, the Kalman gain matrix calculated in step S160, and the above equation (14) (step S170), and then calculates the updated weight to be updated based on those M estimated weight vectors and the above equation (15) (step S180).
 Next, the machine learning device 1 waits until the next input data is received by the input layer L1 (step S190).
 When the machine learning device 1 determines that the next input data has been received by the input layer L1 (step S190 - YES), it returns to step S120 and calculates M predicted weight vectors based on the M estimated weight vectors calculated in the most recently executed step S170.
 By the processing of the flowchart described above, the machine learning device 1 performs the weight update process based on the ensemble Kalman filter method. This allows the machine learning device 1 to greatly reduce the computational cost of calculating the Kalman gain matrix and, as a result, to suppress the occurrence of numerical instability caused by quantization error. The weight update process also allows the machine learning device 1 to perform online learning by the ensemble FORCE learning shown in FIG. 2. Consequently, the machine learning device 1 can be mounted on an edge device, for example, as a device that performs machine learning by this ensemble FORCE learning. When mounting ensemble FORCE learning on an edge device or the like is considered, making this weight update process efficient becomes important, so an efficient data flow is required in the weight update process. In particular, when ensemble FORCE learning is implemented on an edge device or the like as hardware including at least one of near-memory and memory logic, realizing an efficient data flow is extremely important because it leads to faster memory access, faster computation, and the like. The efficient data flow in the weight update process is therefore described below.
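 To make the flow of steps S110 to S180 concrete, the following is a sketch of one pass of the weight update process for a single output node. It assumes the standard ensemble Kalman filter forms for equations (5) to (15), which are not reproduced in this section, and it assumes a linear readout of a reservoir state r as the output model; the noise scales, function names, and array shapes are likewise illustrative.

import numpy as np

rng = np.random.default_rng(0)
L, M = 500, 100  # number of intermediate nodes and ensemble size (assumed)

def weight_update(w_est, r, d, sig_w=1e-4, sig_y=1e-3):
    """One weight update (steps S120-S180) for a single output node.

    w_est: (L, M) estimated weight vectors, one column per sample
    r:     (L,)   intermediate-layer state for the current input
    d:     float  teacher data for the current input
    """
    # S120: predicted weight vectors = estimated weights + weight error vectors
    w_pred = w_est + sig_w * rng.standard_normal((L, M))
    # S130: predicted output vectors = readout + output error vectors
    y_pred = w_pred.T @ r + sig_y * rng.standard_normal(M)        # (M,)
    # S140: weight / output error ensemble vectors (deviations from the means)
    dW = w_pred - w_pred.mean(axis=1, keepdims=True)              # (L, M)
    dY = (y_pred - y_pred.mean())[None, :]                        # (1, M)
    # S150: first and second covariance matrices
    P_wy = dW @ dY.T / M                                          # (L, 1)
    P_yy = dY @ dY.T / M                                          # (1, 1) scalar
    # S160: Kalman gain matrix (a scalar division; no matrix inversion)
    K = P_wy / P_yy                                               # (L, 1)
    # S170: estimated weight vectors for the next step
    w_next = w_pred + K @ (d - y_pred)[None, :]                   # (L, M)
    # S180: the updated weight to be updated, taken as the ensemble mean
    return w_next, w_next.mean(axis=1)

# S110: initial estimated weight vectors, here drawn from random numbers
w_est = rng.standard_normal((L, M))
w_est, w_updated = weight_update(w_est, rng.standard_normal(L), d=0.5)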
 <Data flow in the weight update process>
 As described above, the ensemble FORCE learner can be implemented on an edge device or the like as hardware including at least one of near-memory and memory logic. The memory access speed, computation speed, and the like of an ensemble FORCE learner implemented as such hardware differ depending on the design of the data flow in the weight update process based on the ensemble Kalman filter method. For this reason, when implementing the ensemble FORCE learner on an edge device or the like as hardware including at least one of near-memory and memory logic, an efficient data flow must be considered.
 Accordingly, a specific example of a data flow considered efficient for the weight update process based on the ensemble Kalman filter method is described below.
 FIG. 4 is a diagram showing an example of the overall configuration of the data flow in the weight update process based on the ensemble Kalman filter method. As shown in FIG. 4, the data flow in the weight update process is broadly composed of three blocks: block B1, block B2, and block B3. Each of these three blocks represents hardware including at least one of near-memory and memory logic. In FIG. 4, the chronological order in the data flow is indicated by the time k.
 Block B1 is a block that calculates the M predicted weight vectors. Block B1 contains one block for each of the M estimated weight vectors. More specifically, block B1 contains block B1-j as the block to which the j-th of the M estimated weight vectors is input. That is, block B1 contains M blocks, blocks B1-1 to B1-M.
 Here, as shown in FIG. 4, the j-th of the M estimated weight vectors and the j-th of the M weight error vectors are input to block B1-j. Block B1-j then calculates the j-th of the M predicted weight vectors based on that estimated weight vector, that weight error vector, and the above equation (5), and outputs the calculated predicted weight vector to block B2.
 Block B2 is a block that calculates the M estimated weight vectors. Block B2 contains one block for each of the M predicted weight vectors. More specifically, block B2 contains block B2-j as the block to which the j-th of the M predicted weight vectors is input. That is, block B2 contains M blocks, blocks B2-1 to B2-M.
 Here, as shown in FIG. 4, the j-th of the M predicted weight vectors, the j-th of M difference vectors, and the Kalman gain matrix output from block B3 described later are input to block B2-j. The j-th of the M difference vectors is calculated by the following equation (16).
 \tilde{d}_j(k) = d(k) - \hat{y}_j(k) \qquad (16)
 The left-hand side of the above equation (16) denotes the j-th of the M difference vectors. That is, the j-th difference vector is the quantity inside the parentheses of the second term on the right-hand side of equation (14), redefined as a single new vector.
 Block B2-j calculates the j-th of the M estimated weight vectors based on the j-th of the M predicted weight vectors, the j-th of the M difference vectors, the Kalman gain matrix, and the above equation (14), and outputs the calculated estimated weight vector. The machine learning device 1 can thereby calculate the updated weight to be updated in another block, not shown in FIG. 4, based on the M estimated weight vectors calculated in block B2 and the above equation (15).
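 The per-sample blocks B1-j and B2-j can be pictured as the following small functions. This is a sketch under the assumption that equation (5) is an additive perturbation of the estimated weight vector and that equation (14) has the standard ensemble Kalman filter update form; the function names are illustrative, not from the embodiment.

import numpy as np

def block_b1_j(w_est_j, q_j):
    # B1-j: j-th predicted weight vector from the j-th estimated weight
    # vector and the j-th weight error vector (assumed form of equation (5))
    return w_est_j + q_j

def difference_j(d, y_pred_j):
    # Equation (16): the j-th difference vector, assumed here to be the
    # teacher data minus the j-th predicted output vector
    return d - y_pred_j

def block_b2_j(w_pred_j, diff_j, K):
    # B2-j: j-th estimated weight vector via the assumed form of equation
    # (14), using the Kalman gain matrix K supplied by block B3
    return w_pred_j + K @ diff_j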
 Block B3 is a block that calculates the Kalman gain matrix. The M predicted weight vectors output from block B1, together with the corresponding M predicted output vectors, are input to block B3, and block B3 calculates the Kalman gain matrix based on them. Block B3 outputs the calculated Kalman gain matrix and, in doing so, also outputs the Kalman gain matrix to block B2.
 Here, FIG. 5 is a diagram showing the simplest concrete example of the data flow inside block B3. The data flow shown in FIG. 5 holds no matter what nonlinear function is adopted as the second activation function in ensemble FORCE learning. The data flow shown in FIG. 5 is broadly composed of five blocks, blocks B31 to B35. Each of these five blocks represents hardware including at least one of near-memory and memory logic. In FIG. 5, the chronological order in the data flow is indicated by the time k.
 Block B31 is a block that calculates the first term on the right-hand side of the above equation (8) and the first term on the right-hand side of the above equation (10), that is, the mean of the M predicted weight vectors and the mean of the M predicted output vectors, respectively. The M predicted weight vectors and the M predicted output vectors are input to block B31, which calculates these two means and outputs them to block B32. At this time, block B31 also passes the input M predicted weight vectors and the input M predicted output vectors on to block B32.
 Block B32 is a block that calculates the components of the weight error ensemble vector and the components of the output error ensemble vector. That is, the mean of the M predicted weight vectors and the mean of the M predicted output vectors output from block B31, together with the M predicted weight vectors and the M predicted output vectors themselves, are input to block B32. Block B32 then calculates the components of the weight error ensemble vector based on the M predicted weight vectors and their mean, calculates the components of the output error ensemble vector based on the M predicted output vectors and their mean, and outputs both sets of components to block B33.
 Block B33 is a block that generates the weight error ensemble vector and the output error ensemble vector. That is, the components of the weight error ensemble vector and the components of the output error ensemble vector output from block B32 are input to block B33. Block B33 then generates the weight error ensemble vector from the components of the weight error ensemble vector output from block B32, generates the output error ensemble vector from the components of the output error ensemble vector, and outputs the generated weight error ensemble vector and the generated output error ensemble vector to block B34.
 Block B34 is a block that calculates the first covariance matrix and the second covariance matrix; in other words, block B34 is the block that performs the computations of the above equations (11) and (12). That is, the weight error ensemble vector and the output error ensemble vector output from block B33 are input to block B34. Block B34 then calculates the first covariance matrix based on the weight error ensemble vector and the output error ensemble vector, calculates the second covariance matrix based on the output error ensemble vector, and outputs the calculated first covariance matrix and the calculated second covariance matrix to block B35.
 Block B35 is a block that calculates the Kalman gain matrix; in other words, block B35 is the block that performs the computation of the above equation (13). That is, the first covariance matrix and the second covariance matrix output from block B34 are input to block B35. Block B35 then calculates the Kalman gain matrix based on the first covariance matrix and the second covariance matrix, and outputs the calculated Kalman gain matrix.
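 One way to picture the pipeline of FIG. 5 is as five functions, one per block B31 to B35, chained as below. This is a sketch: the array layouts (an L x M weight ensemble and an s x M output ensemble) and the 1/M normalization follow the standard ensemble Kalman filter, not the exact equations (7) to (13) of the text, and the names are illustrative.

import numpy as np

def b31(W, Y):
    # B31: means of the M predicted weight / output vectors (W, Y passed on)
    return W, Y, W.mean(axis=1, keepdims=True), Y.mean(axis=1, keepdims=True)

def b32(W, Y, W_mean, Y_mean):
    # B32: components of the weight / output error ensemble vectors
    return W - W_mean, Y - Y_mean

def b33(dW, dY):
    # B33: assemble the error ensemble vectors (already column-stacked here)
    return dW, dY

def b34(dW, dY):
    # B34: first and second covariance matrices (equations (11) and (12))
    M = dW.shape[1]
    return dW @ dY.T / M, dY @ dY.T / M

def b35(P_wy, P_yy):
    # B35: Kalman gain matrix (equation (13)); inverts at most an s x s matrix
    return P_wy @ np.linalg.inv(P_yy)

W = np.random.randn(500, 100)   # M = 100 predicted weight vectors (L = 500)
Y = np.random.randn(1, 100)     # M predicted output vectors (s = 1 output node)
K = b35(*b34(*b33(*b32(*b31(W, Y)))))
print(K.shape)                  # (500, 1)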
 Based on the data flow described above, the machine learning device 1 can implement ensemble FORCE learning on an edge device or the like as hardware including at least one of near-memory and memory logic. As a result, the machine learning device 1 can achieve faster memory access, faster computation, and the like without using any special function as the second activation function.
 <Results of machine learning by the machine learning device>
 The results of machine learning by the machine learning device 1 are described below.
 In the following, the results of machine learning by the machine learning device 1 are described using, as an example, the result of having the machine learning device 1 machine-learn the temporal change in the displacement of the double pendulum shown in FIG. 6. FIG. 6 is a diagram showing an example of a double pendulum composed of a first weight of mass m1 connected to the origin by a rod of length l1 and a second weight of mass m2 connected to the first weight by a rod of length l2. The temporal changes in the X-axis and Y-axis displacements of the first weight and the second weight of the double pendulum shown in FIG. 6 are described deterministically by equations of motion. In FIG. 6, gravity acts in the direction indicated by the arrow g.
 The equations of motion for the double pendulum shown in FIG. 6 are written down for the first weight and the second weight, respectively. The forces in the equations of motion written for the first and second weights are then expressed as functions of four parameters: the angle θ1 between the Y axis shown in FIG. 6 and the rod l1, the angle θ2 between that Y axis and the rod l2, the angular velocity that is the change in the angle θ1 per unit time, and the angular velocity that is the change in the angle θ2 per unit time.
 We therefore detected these four parameters in chronological order with sensors and input the four parameters, detected in chronological order, to the machine learning device 1 as four-dimensional input data. In doing so, we stored teacher data on the temporal change in the displacement of each of the first and second weights in the machine learning device 1 in advance, and then had the machine learning device 1 perform online learning of the temporal changes in the displacements of the first and second weights for a predetermined period. The results are the graphs shown in FIGS. 7 and 8.
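 As an illustration of how such input data and teacher data line up, the following sketch assembles the four detected parameters into four-dimensional input vectors and computes the X-axis displacement of the second weight from the geometry of FIG. 6 (with the angles measured from the Y axis, x2 = l1·sin θ1 + l2·sin θ2). The time series, the number of steps, and the rod lengths are placeholders, not measured values.

import numpy as np

T, l1, l2 = 800, 1.0, 1.0                    # time steps and rod lengths (assumed)
theta1 = np.zeros(T); omega1 = np.zeros(T)   # sensor time series (placeholders)
theta2 = np.zeros(T); omega2 = np.zeros(T)

# Four-dimensional input data, one vector per time step, fed in chronological order
inputs = np.stack([theta1, theta2, omega1, omega2], axis=1)    # (T, 4)

# X-axis displacement of the second weight, usable as the teacher signal
teacher_x2 = l1 * np.sin(theta1) + l2 * np.sin(theta2)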
 FIG. 7 is a diagram showing an example of a graph plotting the temporal change in the output data output from the machine learning device 1 during the period in which the machine learning device 1 was machine-learning the temporal change in the X-axis displacement of the second weight of the double pendulum shown in FIG. 6. The vertical axis of the graph shown in FIG. 7 indicates the displacement of the second weight in the X-axis direction, and the horizontal axis indicates the elapsed time. In FIG. 7, this period is shown as the period from elapsed time 0 to elapsed time 400.
 The plot PLT1 in the graph shown in FIG. 7 is a plot of the teacher data, and the plot PLT2 is a plot of the output data. As shown in FIG. 7, the degree of agreement between the output data output from the machine learning device 1 during online learning and the teacher data is not very high. This is because the machine learning device 1 is still in the middle of online learning.
 On the other hand, FIG. 8 is a diagram showing an example of a graph plotting the temporal change in the output data output from the machine learning device 1 in the period after the machine learning device 1 had machine-learned the temporal change in the X-axis displacement of the second weight of the double pendulum shown in FIG. 6. The vertical axis of the graph shown in FIG. 8 indicates the displacement of the second weight in the X-axis direction, and the horizontal axis indicates the elapsed time. In FIG. 8, this period is shown as the period from elapsed time 400 to elapsed time 800.
 The plot PLT1 in the graph shown in FIG. 8 is a plot of the teacher data, and the plot PLT3 is a plot of the output data. As shown in FIG. 8, the degree of agreement between the output data output from the machine learning device 1 after online learning and the teacher data is higher than before online learning.
 Here, the examples shown in FIGS. 7 and 8 are results of having the machine learning device 1 perform online learning with 500 intermediate nodes and 100 samples in the ensemble Kalman filter method (that is, with M = 100). The accuracy of the online learning performed by the machine learning device 1 varies with the number of intermediate nodes and the number of samples.
 The examples shown in FIGS. 9 and 10 are results of having the machine learning device 1 draw graphs similar to those shown in FIGS. 7 and 8 with 250 intermediate nodes and 100 samples in the ensemble Kalman filter method.
 FIG. 9 is a diagram showing another example of a graph plotting the temporal change in the output data output from the machine learning device 1 during the period in which the machine learning device 1 was machine-learning the temporal change in the X-axis displacement of the second weight of the double pendulum shown in FIG. 6. The vertical axis of the graph shown in FIG. 9 indicates the displacement of the second weight in the X-axis direction, and the horizontal axis indicates the elapsed time. In FIG. 9, this period is shown as the period from elapsed time 0 to elapsed time 400.
 The plot PLT1 in the graph shown in FIG. 9 is a plot of the teacher data, and the plot PLT4 is a plot of the output data. As shown in FIG. 9, the degree of agreement between the output data output from the machine learning device 1 during online learning and the teacher data is not very high. This is because the machine learning device 1 is still in the middle of online learning.
 On the other hand, FIG. 10 is a diagram showing another example of a graph plotting the temporal change in the output data output from the machine learning device 1 in the period after the machine learning device 1 had machine-learned the temporal change in the X-axis displacement of the second weight of the double pendulum shown in FIG. 6. The vertical axis of the graph shown in FIG. 10 indicates the displacement of the second weight in the X-axis direction, and the horizontal axis indicates the elapsed time. In FIG. 10, this period is shown as the period from elapsed time 400 to elapsed time 800.
 The plot PLT1 in the graph shown in FIG. 10 is a plot of the teacher data, and the plot PLT5 is a plot of the output data. As shown in FIG. 10, the degree of agreement between the output data output from the machine learning device 1 after online learning and the teacher data is higher than before online learning. Moreover, the degree of agreement between the output data and the teacher data after online learning in the example shown in FIG. 10 hardly differs from that in the example shown in FIG. 8. This means that the online learning performed by the machine learning device 1 remains highly accurate even though the number of intermediate nodes in the example shown in FIG. 10 is half the number of intermediate nodes in the example shown in FIG. 7.
 That is, through ensemble FORCE learning and the weight update process based on the ensemble Kalman filter method, the machine learning device 1 can improve the accuracy of online learning while reducing the number of intermediate nodes. As a result, the machine learning device 1 can achieve both a reduction in manufacturing cost and an improvement in machine learning accuracy.
 The examples shown in FIGS. 11 and 12 are results of having the machine learning device 1 draw graphs similar to those shown in FIGS. 7 and 8 with 250 intermediate nodes and 20 samples in the ensemble Kalman filter method.
 FIG. 11 is a diagram showing yet another example of a graph plotting the temporal change in the output data output from the machine learning device 1 during the period in which the machine learning device 1 was machine-learning the temporal change in the X-axis displacement of the second weight of the double pendulum shown in FIG. 6. The vertical axis of the graph shown in FIG. 11 indicates the displacement of the second weight in the X-axis direction, and the horizontal axis indicates the elapsed time. In FIG. 11, this period is shown as the period from elapsed time 0 to elapsed time 400.
 The plot PLT1 in the graph shown in FIG. 11 is a plot of the teacher data, and the plot PLT6 is a plot of the output data. As shown in FIG. 11, the degree of agreement between the output data output from the machine learning device 1 during online learning and the teacher data is not very high. This is because the machine learning device 1 is still in the middle of online learning.
 On the other hand, FIG. 12 is a diagram showing yet another example of a graph plotting the temporal change in the output data output from the machine learning device 1 in the period after the machine learning device 1 had machine-learned the temporal change in the X-axis displacement of the second weight of the double pendulum shown in FIG. 6. The vertical axis of the graph shown in FIG. 12 indicates the displacement of the second weight in the X-axis direction, and the horizontal axis indicates the elapsed time. In FIG. 12, this period is shown as the period from elapsed time 400 to elapsed time 800.
 The plot PLT1 in the graph shown in FIG. 12 is a plot of the teacher data, and the plot PLT7 is a plot of the output data. As shown in FIG. 12, the degree of agreement between the output data output from the machine learning device 1 after online learning and the teacher data is higher than before online learning. Moreover, the degree of agreement between the output data and the teacher data after online learning in the example shown in FIG. 12 hardly differs from that in the example shown in FIG. 10. This means that the online learning performed by the machine learning device 1 remains highly accurate even though the number of samples in the example shown in FIG. 12 is one fifth of the number of samples in the example shown in FIG. 10.
 That is, through ensemble FORCE learning and the weight update process based on the ensemble Kalman filter method, the machine learning device 1 can improve the accuracy of online learning while reducing the number of samples. As a result, the machine learning device 1 can achieve both a reduction in manufacturing cost and an improvement in machine learning accuracy.
 As described above, the machine learning device according to the embodiment is a machine learning device that performs machine learning of one-dimensional or higher-dimensional input data arranged in a predetermined order, using a recurrent neural network having a plurality of nodes coupled to one another by edges to which weights are assigned. The recurrent neural network has an input layer having one or more input nodes, an intermediate layer having one or more intermediate nodes, and an output layer having one or more output nodes; the input nodes, the intermediate nodes, and the output nodes are mutually different nodes among the plurality of nodes; and the weight assigned to each edge coupling the intermediate nodes to one another is fixed at a predetermined magnitude. The machine learning device performs an output data generation process and a weight update process each time the input layer receives one-dimensional or higher-dimensional input data in the predetermined order. The output data generation process performs, in order, a first process of outputting the input data received by the input layer from the input layer to the intermediate layer, a second process of outputting, from the intermediate layer to the output layer, one-dimensional or higher-dimensional intermediate data corresponding to the input data input to the intermediate layer by the first process, and a third process of generating one-dimensional or higher-dimensional output data corresponding to the one-dimensional or higher-dimensional intermediate data input to the output layer by the second process. The weight update process takes, as an estimated weight vector, a vector whose components are estimated values of the weights assigned to the edges coupling the intermediate nodes and the output nodes, and, as a predicted output vector, a vector whose components are predicted values of the one-dimensional or higher-dimensional output data; it calculates the Kalman gain matrix of the ensemble Kalman filter method based on two or more estimated weight vectors whose components differ from one another and on the predicted output vector calculated for each of those estimated weight vectors, and updates, based on the calculated Kalman gain matrix, the weights assigned to the edges coupling the intermediate nodes and the output nodes. As a result, in a recurrent neural network involving the matrix computation for calculating the Kalman gain matrix, the machine learning device can suppress the occurrence of numerical instability caused by quantization error without increasing the number of quantization bits.
 In the weight update process, the machine learning device may also be configured to calculate two or more predicted weight vectors based on the two or more estimated weight vectors, to calculate a predicted weight error ensemble vector based on the calculated two or more predicted weight vectors and a predicted output error ensemble vector based on the two or more predicted output vectors, and to calculate the Kalman gain matrix based on the calculated predicted weight error ensemble vector and the calculated predicted output error ensemble vector.
 In the machine learning device, a configuration may also be used in which the output layer has a single output node, the predicted output vector is a vector whose components are predicted values of one-dimensional output data, and the Kalman gain matrix is a matrix having a plurality of rows and one column.
 In the machine learning device, a configuration may also be used in which the intermediate layer is a reservoir.
 The machine learning device may also be configured so that at least the weight update process is performed by hardware including at least one of near-memory and memory logic.
 Although an embodiment of the present invention has been described above in detail with reference to the drawings, the specific configuration is not limited to this embodiment, and changes, substitutions, deletions, and the like may be made without departing from the gist of the present invention.
 A program for realizing the functions of any component of the devices described above (for example, the machine learning device 1) may be recorded on a computer-readable recording medium, and the program may be read into a computer system and executed. The term "computer system" here includes an OS (Operating System) and hardware such as peripheral devices. A "computer-readable recording medium" means a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD (Compact Disk)-ROM, or a storage device such as a hard disk built into a computer system. Furthermore, a "computer-readable recording medium" also includes media that hold a program for a certain period of time, such as the volatile memory (RAM) inside a computer system that serves as a server or a client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line.
 The above program may also be transmitted from a computer system in which it is stored in a storage device or the like to another computer system via a transmission medium, or by transmission waves in a transmission medium. Here, the "transmission medium" that transmits the program means a medium having a function of transmitting information, such as a network (communication network) like the Internet or a communication line (communication channel) like a telephone line.
 The above program may also be one for realizing part of the functions described above. Furthermore, the above program may be a so-called difference file (difference program) that can realize the functions described above in combination with a program already recorded in the computer system.
 1 ... machine learning device, 11 ... arithmetic device, 12 ... memory, 13 ... network interface, L1 ... input layer, L2 ... intermediate layer, L3 ... output layer

Claims (8)

  1.  A machine learning device that performs machine learning of one-dimensional or higher-dimensional input data arranged in a predetermined order, using a recurrent neural network having a plurality of nodes coupled to one another by edges to which weights are assigned, wherein
     the recurrent neural network comprises an input layer having one or more input nodes, an intermediate layer having one or more intermediate nodes, and an output layer having one or more output nodes,
     the input nodes, the intermediate nodes, and the output nodes are mutually different nodes among the plurality of nodes,
     the weight assigned to each edge coupling the intermediate nodes to one another is fixed at a predetermined magnitude,
     the machine learning device performs an output data generation process and a weight update process each time the input layer receives the one-dimensional or higher-dimensional input data in the predetermined order,
     the output data generation process is a process of performing, in order, a first process of outputting the input data received by the input layer from the input layer to the intermediate layer, a second process of outputting, from the intermediate layer to the output layer, one-dimensional or higher-dimensional intermediate data corresponding to the input data input to the intermediate layer by the first process, and a third process of generating one-dimensional or higher-dimensional output data corresponding to the one-dimensional or higher-dimensional intermediate data input to the output layer by the second process, and
     the weight update process is a process of taking, as an estimated weight vector, a vector whose components are estimated values of the weights assigned to the respective edges coupling the intermediate nodes and the output nodes, and, as a predicted output vector, a vector whose components are predicted values of the one-dimensional or higher-dimensional output data, calculating a Kalman gain matrix in an ensemble Kalman filter method based on two or more of the estimated weight vectors whose components differ from one another and on the predicted output vector calculated for each of the two or more estimated weight vectors, and updating, based on the calculated Kalman gain matrix, the weights assigned to the respective edges coupling the intermediate nodes and the output nodes.
  2.  The machine learning device according to claim 1, wherein, in the weight update process, the machine learning device calculates the two or more predicted weight vectors based on the two or more estimated weight vectors, calculates a predicted weight error ensemble vector based on the calculated two or more predicted weight vectors and a predicted output error ensemble vector based on the two or more predicted output vectors, and calculates the Kalman gain matrix based on the calculated predicted weight error ensemble vector and the calculated predicted output error ensemble vector.
  3.  The machine learning device according to claim 1 or 2, wherein the output layer has a single output node, the predicted output vector is a vector whose components are predicted values of one-dimensional output data, and the Kalman gain matrix is a matrix having a plurality of rows and one column.
  4.  The machine learning device according to any one of claims 1 to 3, wherein the intermediate layer is a reservoir.
  5.  The machine learning device according to any one of claims 1 to 4, wherein at least the weight update process is performed by hardware including at least one of near-memory and memory logic.
  6.  A machine learning program that causes a computer to perform machine learning of one-dimensional or higher-dimensional input data arranged in a predetermined order, using a recurrent neural network having a plurality of nodes coupled to one another by edges to which weights are assigned, wherein
     the recurrent neural network comprises an input layer having one or more input nodes, an intermediate layer having one or more intermediate nodes, and an output layer having one or more output nodes,
     the input nodes, the intermediate nodes, and the output nodes are mutually different nodes among the plurality of nodes,
     the weight assigned to each edge coupling the intermediate nodes to one another is fixed at a predetermined magnitude,
     the machine learning program performs an output data generation process and a weight update process each time the input layer receives the one-dimensional or higher-dimensional input data in the predetermined order,
     the output data generation process is a process of performing, in order, a first process of outputting the input data received by the input layer from the input layer to the intermediate layer, a second process of outputting, from the intermediate layer to the output layer, one-dimensional or higher-dimensional intermediate data corresponding to the input data input to the intermediate layer by the first process, and a third process of generating one-dimensional or higher-dimensional output data corresponding to the one-dimensional or higher-dimensional intermediate data input to the output layer by the second process, and
     the weight update process is a process of taking, as an estimated weight vector, a vector whose components are estimated values of the weights assigned to the respective edges coupling the intermediate nodes and the output nodes, and, as a predicted output vector, a vector whose components are predicted values of the one-dimensional or higher-dimensional output data, calculating a Kalman gain matrix in an ensemble Kalman filter method based on two or more of the estimated weight vectors whose components differ from one another and on the predicted output vector calculated for each of the two or more estimated weight vectors, and updating, based on the calculated Kalman gain matrix, the weights assigned to the respective edges coupling the intermediate nodes and the output nodes.
  7.  A machine learning method for performing machine learning of one-dimensional or higher-dimensional input data arranged in a predetermined order, using a recurrent neural network having a plurality of nodes coupled to one another by edges to which weights are assigned, wherein
     the recurrent neural network comprises an input layer having one or more input nodes, an intermediate layer having one or more intermediate nodes, and an output layer having one or more output nodes,
     the input nodes, the intermediate nodes, and the output nodes are mutually different nodes among the plurality of nodes,
     the weight assigned to each edge coupling the intermediate nodes to one another is fixed at a predetermined magnitude,
     the machine learning method performs an output data generation process and a weight update process each time the input layer receives the one-dimensional or higher-dimensional input data in the predetermined order,
     the output data generation process is a process of performing, in order, a first process of outputting the input data received by the input layer from the input layer to the intermediate layer, a second process of outputting, from the intermediate layer to the output layer, one-dimensional or higher-dimensional intermediate data corresponding to the input data input to the intermediate layer by the first process, and a third process of generating one-dimensional or higher-dimensional output data corresponding to the one-dimensional or higher-dimensional intermediate data input to the output layer by the second process, and
     the weight update process is a process of taking, as an estimated weight vector, a vector whose components are estimated values of the weights assigned to the respective edges coupling the intermediate nodes and the output nodes, and, as a predicted output vector, a vector whose components are predicted values of the one-dimensional or higher-dimensional output data, calculating a Kalman gain matrix in an ensemble Kalman filter method based on two or more of the estimated weight vectors whose components differ from one another and on the predicted output vector calculated for each of the two or more estimated weight vectors, and updating, based on the calculated Kalman gain matrix, the weights assigned to the respective edges coupling the intermediate nodes and the output nodes.
  8.  A machine learning device that, in reservoir computing, updates weights by an ensemble Kalman filter method.