WO2020184560A1 - Data prediction device, data prediction method, and data prediction program - Google Patents

Publication number
WO2020184560A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
regression coefficient
calculation method
unit
prediction
Prior art date
Application number
PCT/JP2020/010303
Other languages
French (fr)
Japanese (ja)
Inventor
高嶋 洋一
昌宏 湯口
山田 智広
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社
Publication of WO2020184560A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Definitions

  • the present disclosure relates to a data prediction device, a data prediction method, and a data prediction program.
  • Equation (1) predicts the data $D_{1,t+1}$ of sensor 1 at time t+1 from the 12 data points from time t-2 to time t, out of the per-unit-time data from time t-3 to time t of the four sensors 1 to 4 illustrated in FIG. 1.
  • $D_{l,t-m}$ is the data illustrated by a circle in FIG. 1
  • l is the sensor number
  • t-m is the time
  • m represents a natural number.
  • the data is predicted by the following learning process and prediction process.
  • first, the past data of each sensor that appears related to the data to be predicted is collected to create a linear regression equation; the data of the sample interval in which the phenomenon to be predicted (a distinctive movement of the prediction target) occurs is substituted into the variables of the linear regression equation, and the regression coefficients $\beta_k$ are obtained by solving simultaneous equations (learning process).
  • next, the predicted value is continuously calculated from the sensor data using the obtained regression coefficients $\beta_k$ (prediction process).
  • the data of the sample interval may be insufficient and the solution of the simultaneous equations may not be obtained.
  • a mesh-shaped rainfall forecast value created by the Japan Meteorological Agency can be used instead of the data from the sensor or in addition to the data from the sensor.
  • the number of regression coefficients will increase further, and the data of the sample interval may be insufficient.
  • the regression coefficient of the approximate solution can be obtained by using the method called L1 regularization (Non-Patent Document 1) as illustrated by Eq. (2).
  • L1 regularization has the property of driving regression coefficients that contribute little to the prediction toward zero.
  • $y_i$ is the dependent variable
  • $x_{ij}$ is the explanatory variable
  • $t_c$ is the adjustment parameter.
  • FIG. 2 shows a data prediction device 10 of a related technology configured by using L1 regularization.
  • the data collection unit 11 collects data from each of a large number of sensors and transmits the same data to the prediction unit 12 and the learning section selection unit 13.
  • the learning section selection unit 13 selects data in a data section in which an event to be predicted is likely to appear based on a rule, and transmits the selected data to the regression coefficient calculation unit 14 as data to be learned.
  • the regression coefficient calculation unit 14 obtains a regression coefficient using L1 regularization, and transmits the regression coefficient to the prediction unit 12.
  • the prediction unit 12 continuously calculates the prediction value by substituting the regression coefficient received from the regression coefficient calculation unit 14 and the data received from the data collection unit 11 into the linear regression equation.
  • L1 regularization makes it possible to obtain regression coefficients even when many explanatory variables are used, but because the range of sensors that affect the prediction and the range of data measurement times are not known in advance, a wide range has to be set. Therefore, the number of data used as input becomes very large, which strains the line capacity for collecting and transmitting data and also makes the amount of calculation processing in the learning process enormous.
  • the purpose of this disclosure is to reduce the number of data used when calculating the regression coefficient, reduce the line capacity for collecting and transmitting data, and reduce the amount of calculation processing in the learning process.
  • the data prediction device of the first aspect of the present disclosure includes: a data collection unit that transmits a plurality of collected data and selection data selected from the plurality of data based on received selection information; a first regression coefficient calculation unit that calculates regression coefficients based on the plurality of data received from the data collection unit, using a first calculation method that obtains regression coefficients by L1 regularization, and transmits to the data collection unit selection information for selecting the data corresponding to those calculated regression coefficients whose absolute value is equal to or greater than a threshold value; a determination unit that determines whether the first calculation method or a second calculation method different from the first calculation method is used when calculating the regression coefficients; a second regression coefficient calculation unit that calculates the regression coefficients using the calculation method determined by the determination unit; and a prediction unit that uses the plurality of data received from the data collection unit when the determined calculation method is the first calculation method, uses the selection data received from the data collection unit when the determined calculation method is the second calculation method, and outputs a prediction result predicted based on the data used and the regression coefficients calculated by the second regression coefficient calculation unit.
  • the second aspect of the present disclosure is the data prediction device of the first aspect, further including an error monitoring unit that monitors the number of occurrences of an error of a predetermined value or more, which is an error between the prediction result and the measured value.
  • the third aspect of the present disclosure is the data prediction device of the first aspect, further including an event determination unit that determines whether or not a preset event has occurred, wherein, when the event determination unit determines that the event has occurred while the determination unit has determined to use the second calculation method, the determination unit determines that the first calculation method is to be used.
  • the fourth aspect of the present disclosure is the data prediction device according to any one of the first to third aspects, further including an error monitoring unit that monitors the number of occurrences of errors, between the prediction result and the actually measured value, that are equal to or greater than a predetermined value. When the number of occurrences monitored by the error monitoring unit exceeds a predetermined number of times, the determination unit instructs the first regression coefficient calculation unit to recalculate the regression coefficients by the first calculation method and to retransmit the selection information to the data collection unit. The first regression coefficient calculation unit recalculates the regression coefficients and retransmits the selection information in response to the instruction from the determination unit, and the data collection unit retransmits the selection data according to the selection information retransmitted in response to that instruction.
  • a fifth aspect of the present disclosure is a program that causes a computer to execute a data prediction process in which: a plurality of collected data and selection data selected from the plurality of data based on received selection information are transmitted; regression coefficients are calculated based on the received plurality of data using a first calculation method that obtains regression coefficients by L1 regularization; selection information for selecting the data corresponding to those calculated regression coefficients whose absolute value is equal to or greater than a threshold value is transmitted; it is determined which of the first calculation method and a second calculation method different from the first calculation method is used when calculating the regression coefficients; the regression coefficients are calculated using the determined calculation method; the received plurality of data is used when the determined calculation method is the first calculation method, and the received selection data is used when the determined calculation method is the second calculation method; and a prediction result predicted based on the data used and the calculated regression coefficients is output.
  • a sixth aspect of the present disclosure is a data prediction method in which a computer transmits a plurality of collected data and selection data selected from the plurality of data based on received selection information, calculates regression coefficients based on the received plurality of data using a first calculation method that obtains regression coefficients by L1 regularization, and transmits selection information for selecting the data corresponding to those calculated regression coefficients whose absolute value is equal to or greater than a threshold value.
  • the data prediction device 20 of this embodiment is illustrated in FIG. 3.
  • the data collection unit 21 collects data from each of a large number of sensors S1, S2, S3, ..., and transmits the same data to the prediction unit 22 and the learning section selection unit 23.
  • the learning section selection unit 23 selects, based on rules, the data section in which the event to be predicted is expected to appear, and transmits the data to be learned to the first regression coefficient calculation unit 25 and the second regression coefficient calculation unit 24. That is, the learning section selection unit 23 removes unnecessary data and data with little change that lie in sections where the event to be predicted does not appear.
  • the first regression coefficient calculation unit 25 calculates regression coefficients, using the first calculation method of obtaining approximate-solution regression coefficients by L1 regularization, based on the plurality of data transmitted from the data collection unit 21 and selected by the learning section selection unit 23. It then selects, from the calculated regression coefficients, those whose absolute value is equal to or greater than the threshold value, and transmits information indicating the selected regression coefficients to each of the data collection unit 21 and the second regression coefficient calculation unit 24.
  • the data collection unit 21 selects, from the plurality of collected data, the data corresponding to the regression coefficients whose absolute value is equal to or greater than the threshold value, based on the information received from the first regression coefficient calculation unit 25, and transmits the selected data as selection data to each of the prediction unit 22, the learning section selection unit 23, and the error monitoring unit 26.
  • the information transmitted from the first regression coefficient calculation unit 25 to the data collection unit 21 is used for selecting data in the data collection unit 21, and is therefore referred to as selection information below.
  • data corresponding to the regression coefficients whose absolute value is equal to or greater than the threshold value is selected as selection data, and the data corresponding to the regression coefficients whose absolute value is less than the threshold value is deleted. Therefore, as the selection information transmitted to the data collection unit 21, information indicating the regression coefficients whose absolute value is less than the threshold value may be used instead of information indicating the regression coefficients whose absolute value is equal to or greater than the threshold value. In that case, the data collection unit 21 obtains the selection data by deleting the data corresponding to the regression coefficients whose absolute value is less than the threshold value.
  • the second regression coefficient calculation unit 24 obtains the regression coefficients from the selection data transmitted from the data collection unit 21 and passed through the learning section selection unit 23, using L2 regularization or multiple regression instead of L1 regularization.
  • the regression coefficient calculated by the second regression coefficient calculation unit 24 is transmitted to the prediction unit 22 in which prediction is performed by regression calculation using L2 regularization or multiple regression.
  • the data collection unit 21 transmits the selection data to the prediction unit 22 and the learning section selection unit 23. That is, the data collection unit 21 collects data from all of the sensors S1, S2, S3, ..., but transmits the data selectively.
  • the prediction unit 22 outputs the prediction result predicted by the linear regression equation using the selection data received from the data collection unit 21 and the regression coefficient received from the second regression coefficient calculation unit 24.
  • because the prediction result is calculated only from the data corresponding to the regression coefficients whose absolute value is equal to or greater than the threshold value, and not from the data corresponding to the regression coefficients whose absolute value is less than the threshold value, the amount of data used in the learning process can be reduced, and the amount of calculation processing can be reduced.
  • the error monitoring unit 26 monitors the error, that is, the difference between the prediction result of the prediction unit 22 and the actually measured value collected by the data collection unit 21. When an error equal to or greater than a predetermined value occurs a predetermined number of times (allowable number of times) or more, a reselection instruction is transmitted to the data collection unit 21 and the first regression coefficient calculation unit 25.
  • the data collection unit 21 will transmit all the data, not the selected data, to the subsequent stage for a certain period until the reselection of the selected data is completed.
  • data is selected by L1 regularization for all data.
  • the selection data is reselected only when the error of the prediction result by L2 regularization or multiple regression is larger than the error of the prediction result by L1 regularization; if the error of the prediction result by L2 regularization or multiple regression is less than or equal to the error of the prediction result by L1 regularization, reselection is not performed.
  • alternatively, the error monitoring unit 26 may monitor the error between the prediction result of the prediction unit 22 and the measured value, and the second regression coefficient calculation unit 24 may recalculate the regression coefficients when an error equal to or greater than a predetermined value occurs a predetermined number of times (allowable number of times) or more. If, after the recalculation, an error equal to or greater than the predetermined value again occurs the predetermined number of times (allowable number of times) or more, a reselection instruction may be transmitted to the data collection unit 21 and the first regression coefficient calculation unit 25.
  • FIG. 4 illustrates the data prediction device 60, which is an extended example of the data prediction device 20 of the present embodiment. The description of the configuration and operation similar to that of the data prediction device 20 will be omitted as appropriate.
  • the data prediction device 60 of FIG. 4 differs from the data prediction device 20 of FIG. 3 in that it has a determination unit 67 that selects the regression coefficient calculation method to be used when calculating the regression coefficients, according to, for example, the tendency of the data.
  • the data collection unit 61 collects data from a large number of sensors S1, S2, S3, ..., and transmits the same data to the prediction unit 62 and the learning section selection unit 63.
  • the learning section selection unit 63 selects, based on rules, a data section in which an event to be predicted is likely to appear, and transmits the data to be learned to the second regression coefficient calculation unit 64.
  • unlike the second regression coefficient calculation unit 24, when the second regression coefficient calculation unit 64 calculates the regression coefficients, the calculation method to be used is selected according to the determination result of the determination unit 67.
  • the regression coefficient calculation method to be used is selected from, for example, the following two: 1) multiple regression or L2 regularization using the selection data (second calculation method); 2) L1 regularization using all data (first calculation method).
  • the determination unit 67 selects the regression coefficient calculation method of 1). After this determination, the second regression coefficient calculation unit 64 calculates the regression coefficients by the regression coefficient calculation method of 1) using the selection data. When the error monitoring unit 66 detects that a prediction error equal to or greater than a predetermined value has occurred more than a predetermined number of times (allowable number of times), the determination unit 67 determines to switch the regression coefficient calculation method from 1) to 2).
  • the second regression coefficient calculation unit 64 calculates the regression coefficient using all the data by the regression coefficient calculation method of 2) based on the determination of the determination unit 67. As a result, a regression coefficient calculation method with less error can be selected according to the data.
  • the learning section selection unit 63 further includes an event determination unit that determines, on a rule basis, whether or not a preset event (for example, rainfall) has occurred. When the event determination unit determines that the data to be learned has been updated, the determination unit 67 may likewise determine that the regression coefficient calculation method of 2) is selected. As a result, even when the data to be learned is updated, a regression coefficient calculation method with less error can be selected.
  • FIG. 5 illustrates the hardware configuration of the data prediction device 60.
  • the data prediction device 60 includes a CPU (Central Processing Unit) 51, a primary storage unit 52, a secondary storage unit 53, and an external interface 54, as shown in FIG.
  • the CPU 51 is an example of a processor that is hardware.
  • the CPU 51, the primary storage unit 52, the secondary storage unit 53, and the external interface 54 are connected to each other via the bus 59.
  • the primary storage unit 52 is, for example, a volatile memory such as a RAM (Random Access Memory).
  • the secondary storage unit 53 is, for example, a non-volatile memory such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive).
  • the secondary storage unit 53 includes a program storage area 53A and a data storage area 53B.
  • the program storage area 53A stores a program such as a data prediction program as an example.
  • the data storage area 53B stores data from the sensor, intermediate data during data prediction processing, and the like.
  • the CPU 51 reads the data prediction program from the program storage area 53A and loads it into the primary storage unit 52. By loading and executing the data prediction program, the CPU 51 operates as the data collection unit 61, the prediction unit 62, the learning section selection unit 63, the second regression coefficient calculation unit 64, the first regression coefficient calculation unit 65, the error monitoring unit 66, and the determination unit 67 of FIG. 4.
  • a program such as a data prediction program may be stored in an external server and loaded into the primary storage unit 52 via a network. Further, a program such as a data prediction program may be stored in a non-transitory recording medium such as a Digital Versatile Disc (DVD) and loaded into the primary storage unit 52 via a recording medium reading device.
  • FIG. 5 shows an example in which the sensor 31A and the danger notification system 31B are connected to the external interface 54.
  • the sensor 31A includes a large number of sensors.
  • the data predicted by the data prediction device 60 may be transmitted to, for example, the danger notification system 31B connected to the external interface 54 and used for the danger notification processing in the danger notification system 31B. Further, the data predicted by the data prediction device 60 may be recorded in, for example, an external storage device connected to the external interface 54, or displayed as characters or images on the screen of a display connected to the external interface 54.
  • the data prediction device 60 may be a dedicated device or a general-purpose device such as a workstation, a personal computer, or a tablet.
  • FIGS. 6 to 8 show examples of data prediction processing in the data prediction device 60.
  • FIG. 6 illustrates the flow of the learning phase.
  • in step S201, the data collection unit 61 collects data from, for example, a sensor 31A including a large number of sensors.
  • in step S202, the data collection unit 61 transmits the collected data to the learning section selection unit 63.
  • in step S203, the learning section selection unit 63 selects a data section that appears suitable for prediction and transmits it to the second regression coefficient calculation unit 64 and the first regression coefficient calculation unit 65.
  • in step S204, the first regression coefficient calculation unit 65 calculates the regression coefficients using L1 regularization, selects the regression coefficients whose absolute value is equal to or greater than the threshold value, and transmits them to the second regression coefficient calculation unit 64 and the data collection unit 61.
  • in step S205, the second regression coefficient calculation unit 64 obtains the regression coefficients by a normal regression calculation method, for example, L2 regularization or multiple regression, using the data selected by the first regression coefficient calculation unit 65, and transmits them to the prediction unit 62.
  • FIG. 7 illustrates the flow of the prediction phase.
  • in step S206, the data collection unit 61 transmits the data selected in step S204 to the prediction unit 62 and the error monitoring unit 66.
  • in step S207, the prediction unit 62 calculates the prediction result based on the data and the regression coefficients obtained in step S205, transmits it to the error monitoring unit 66, and stores it in, for example, an external storage device.
  • FIG. 8 illustrates the flow of error monitoring, recalculation, and reselection phases.
  • in step S208, the error monitoring unit 66 calculates the prediction error between the prediction result and the data (measured value), and when a prediction error equal to or greater than the predetermined value occurs more than the predetermined number of times (allowable number of times), it instructs the data collection unit 61 and the determination unit 67 to switch the regression coefficient calculation method.
  • in step S209, the data collection unit 61 transmits all the data to the learning section selection unit 63, and the determination unit 67 determines that the second regression coefficient calculation unit 64 is to calculate the regression coefficients by L1 regularization using all the data, and transmits the determination result to the second regression coefficient calculation unit 64. Upon receiving the determination of the determination unit 67, the second regression coefficient calculation unit 64 switches the regression coefficient calculation method.
  • in step S210, the error monitoring unit 66 calculates the prediction error between the prediction result and the data (measured value), and when a prediction error equal to or greater than the predetermined value occurs more than the predetermined number of times (allowable number of times), it instructs the data collection unit 61 and the determination unit 67 to reselect.
  • in step S211, the data collection unit 61 transmits all the data to the learning section selection unit 63, and the determination unit 67 determines that reselection is to be performed and transmits the determination result to the first regression coefficient calculation unit 65. Upon receiving the determination of the determination unit 67, the first regression coefficient calculation unit 65 reselects the regression coefficients.
  • the prediction error tends to be larger than when only the regression coefficients calculated by L1 regularization are used for the prediction, but reselecting the data when the prediction error becomes large, or periodically, makes it possible to keep the prediction error within a certain range.
  • the regression coefficient calculation method may be switched from L1 regularization using all the data to L2 regularization or multiple regression using the data selected by the first regression coefficient calculation unit 65.
  • the data prediction device 60 may include a data collection device 32A, a learning device 32B, and a prediction device 32C, as illustrated in FIG. 9.
  • the data collection device 32A includes a data collection unit 61.
  • the learning device 32B includes a learning section selection unit 63, a second regression coefficient calculation unit 64, a first regression coefficient calculation unit 65, an error monitoring unit 66, and a determination unit 67.
  • the prediction device 32C includes a prediction unit 62.
  • a transmission line is connected between the sensor 31A and the data collection unit 61, and between the prediction unit 62 and the output destination of the prediction result, such as the danger notification system 31B. Further, in the example shown in FIG. 9, the data collection unit 61 and the learning section selection unit 63, and the second regression coefficient calculation unit 64 and the prediction unit 62, are also connected by transmission lines.
  • the data prediction device of the present disclosure includes: a data collection unit that transmits a plurality of collected data and selection data selected from the plurality of data based on received selection information; a first regression coefficient calculation unit that calculates regression coefficients based on the plurality of data received from the data collection unit, using a first calculation method that obtains regression coefficients by L1 regularization, and transmits to the data collection unit selection information for selecting the data corresponding to those calculated regression coefficients whose absolute value is equal to or greater than a threshold value; a determination unit that determines which of the first calculation method and a second calculation method different from the first calculation method is used when calculating the regression coefficients; a second regression coefficient calculation unit that calculates the regression coefficients using the calculation method determined by the determination unit; and a prediction unit that uses the plurality of data received from the data collection unit when the determined calculation method is the first calculation method, uses the selection data received from the data collection unit when the determined calculation method is the second calculation method, and outputs a prediction result predicted based on the data used and the regression coefficients calculated by the second regression coefficient calculation unit.
  • this makes it possible to reduce the number of data, reduce the line capacity for collecting and transmitting data, and reduce the amount of calculation processing in the learning process.
  • the data collection unit 61 may reduce the number of sensors that receive the data.
  • since a normal regression coefficient calculation method can be used for the calculation by the second regression coefficient calculation unit 64, the calculation cost can be reduced, and processing is possible even in an environment with few calculation resources.
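
The switching behaviour of the error monitoring unit 66 and the determination unit 67 can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and function names, the error limit, and the allowable count are all hypothetical, and the strict "more than" comparison is one reading of the text, which also uses "or more" in places.

```python
from dataclasses import dataclass

# Labels for the two calculation methods described in the text.
SECOND_METHOD = "L2 regularization or multiple regression on selection data"
FIRST_METHOD = "L1 regularization on all data"


@dataclass
class ErrorMonitor:
    """Hypothetical stand-in for the error monitoring unit."""
    error_limit: float    # the "predetermined value" for a single error
    allowed_count: int    # the "predetermined (allowable) number of times"
    large_errors: int = 0

    def observe(self, predicted: float, measured: float) -> bool:
        """Record one prediction; return True once the count is exceeded."""
        if abs(predicted - measured) >= self.error_limit:
            self.large_errors += 1
        return self.large_errors > self.allowed_count


def decide_method(current: str, monitor_tripped: bool) -> str:
    """Hypothetical stand-in for the determination unit: switch 1) -> 2)."""
    if current == SECOND_METHOD and monitor_tripped:
        return FIRST_METHOD
    return current


monitor = ErrorMonitor(error_limit=1.0, allowed_count=2)
method = SECOND_METHOD
history = []
# Made-up (predicted, measured) pairs; three of them miss by 1.0 or more.
samples = [(10.0, 10.2), (10.0, 11.5), (10.0, 8.0), (10.0, 10.1), (10.0, 12.0)]
for predicted, measured in samples:
    method = decide_method(method, monitor.observe(predicted, measured))
    history.append(method)
```

In this run the method stays on the selection-data regression for the first four samples and switches to L1 regularization on all data at the fifth, when the third large error is recorded.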

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The purpose of the present invention is to reduce the quantity of data used to calculate regression coefficients, reduce the bandwidth for collecting and transmitting data, and reduce computation throughput in a learning process. A data prediction device according to the present disclosure includes: a first regression coefficient computation unit that computes regression coefficients on the basis of a plurality of data and using a first computation method for calculating regression coefficients by L1 regularization, and transmits selection information for selecting data corresponding to the regression coefficients having an absolute value equal to or greater than a threshold value to a data collection unit; a determination unit that determines which of a first computation method and a second computation method to use; a second regression coefficient computation unit that computes regression coefficients using the determined computation method; and a prediction unit that uses the plurality of data if the computation method is determined to be the first computation method and uses selected data if the computation method is determined to be the second computation method, and outputs a prediction result that was predicted on the basis of the data used and the regression coefficients computed by the second regression coefficient computation unit.
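
The two-stage idea in the abstract (select data by L1 regularization, then refit on the selected data with a different method) can be sketched as below. Everything here is an assumption made for illustration: the source names no solver, so a plain coordinate-descent Lasso in its penalized form is used; the function names, the `alpha`, `threshold`, and `ridge` values, and the synthetic data are hypothetical; and ridge regression stands in for the "second calculation method".

```python
import numpy as np


def soft_threshold(value, limit):
    """Shrink value toward zero by limit (the Lasso proximal step)."""
    return np.sign(value) * max(abs(value) - limit, 0.0)


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + alpha * ||b||_1 by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            residual = y - X @ b + X[:, j] * b[j]  # residual ignoring column j
            rho = X[:, j] @ residual / n
            b[j] = soft_threshold(rho, alpha) / col_sq[j]
    return b


def select_then_refit(X, y, alpha=0.05, threshold=0.1, ridge=1e-3):
    """Stage 1: L1 fit and |coefficient| >= threshold selection.
    Stage 2: ridge (L2) refit on the selected columns only."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    b_l1 = lasso_coordinate_descent(Xc, yc, alpha)
    selected = np.flatnonzero(np.abs(b_l1) >= threshold)
    Xs = Xc[:, selected]
    gram = Xs.T @ Xs + ridge * np.eye(len(selected))
    b_l2 = np.linalg.solve(gram, Xs.T @ yc)
    return selected, b_l2


# Synthetic data: 30 candidate inputs, only columns 3 and 17 matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 30))
y = 2.0 * X[:, 3] - 1.5 * X[:, 17] + 0.01 * rng.normal(size=200)
selected, b_l2 = select_then_refit(X, y)
```

After the first stage, only the selected columns would need to be collected and transmitted, which is the bandwidth saving the abstract describes; the refit then runs on a much smaller matrix.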

Description

Data prediction device, data prediction method, and data prediction program
The present disclosure relates to a data prediction device, a data prediction method, and a data prediction program.
In the problem of predicting the near-future value of specific data from the data acquired by a large number of sensors installed in various places, prediction by linear regression, shown in FIG. 1, is sometimes performed, as illustrated in equation (1). Equation (1) below predicts the data $D_{1,t+1}$ of sensor 1 at time t+1 from 12 data items: those from time t-2 to time t, out of the per-unit-time data from time t-3 to time t of the four sensors 1 to 4 illustrated in FIG. 1.

$$
\begin{aligned}
&\beta_0 + \beta_1 D_{1,t-2} + \beta_2 D_{2,t-2} + \beta_3 D_{3,t-2} + \beta_4 D_{4,t-2} \\
&\quad + \beta_5 D_{1,t-1} + \beta_6 D_{2,t-1} + \beta_7 D_{3,t-1} + \beta_8 D_{4,t-1} \\
&\quad + \beta_9 D_{1,t} + \beta_{10} D_{2,t} + \beta_{11} D_{3,t} + \beta_{12} D_{4,t} = D_{1,t+1}
\end{aligned}
\tag{1}
$$
$\beta_k$ ($k = 0, \ldots, 12$) are the regression coefficients, and $D_{l,t-m}$ are the data illustrated by circles in FIG. 1, where $l$ is the sensor number, $t-m$ is the time, and $m$ is a natural number.
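As a minimal illustration of equation (1), the following Python sketch builds the 12-element input from the three most recent readings of four sensors and evaluates the linear regression. The function names, the layout of `history` (one row per time step, oldest first), and all numeric values are assumptions made for this example and do not come from the disclosure.

```python
def lagged_features(history, lags=3):
    """Flatten the last `lags` rows (oldest first) in the order used by equation (1)."""
    if len(history) < lags:
        raise ValueError("not enough time steps in history")
    return [value for row in history[-lags:] for value in row]


def predict_next(beta, history, lags=3):
    """Evaluate beta_0 + sum_k beta_k * D_k for the lagged sensor data."""
    x = lagged_features(history, lags)
    if len(beta) != len(x) + 1:
        raise ValueError("beta needs one intercept plus one coefficient per input")
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


# Hypothetical readings of sensors 1..4 at times t-3, t-2, t-1, t.
history = [
    [1.0, 2.0, 3.0, 4.0],  # t-3 (not used by equation (1))
    [1.5, 2.5, 3.5, 4.5],  # t-2
    [2.0, 3.0, 4.0, 5.0],  # t-1
    [2.5, 3.5, 4.5, 5.5],  # t
]
# Hypothetical coefficients: intercept 0.5 and weight 1.0 on D_{1,t} (beta_9).
beta = [0.5] + [0.0] * 12
beta[9] = 1.0
prediction = predict_next(beta, history)  # estimate of D_{1,t+1}
```

With these made-up coefficients the estimate is simply the intercept plus the latest reading of sensor 1.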
In detail, data is predicted through the following learning process and prediction process. First, past data from each sensor that is likely to be related to the data to be predicted is collected to create a linear regression equation. The data of a sample interval in which the phenomenon to be predicted (a distinctive movement of the prediction target) occurs is substituted into the selected variables of the linear regression equation, and the regression coefficients $\beta_k$ are obtained by solving simultaneous equations (learning process). Next, predicted values are continuously calculated from the sensor data using the obtained regression coefficients $\beta_k$ (prediction process).
When data covering a long past period from a very large number of sensors is used, the data of the sample interval may be insufficient, and the solution of the simultaneous equations may not be obtainable. For example, when predicting the water level of a river, mesh-based rainfall forecast values produced by, for example, the Japan Meteorological Agency can be used instead of, or in addition to, the data from the sensors. In this case, however, the number of regression coefficients increases further, and the data of the sample interval may be insufficient.
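The learning process and its failure mode can be sketched as follows. This example assumes the simultaneous equations are the least-squares normal equations and solves them by Gaussian elimination; the disclosure does not specify a solver, and the function names and sample data are hypothetical. When there are fewer samples than coefficients, the system is singular and no unique solution exists.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: sample data is insufficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_least_squares(rows, targets):
    """Return [beta_0, beta_1, ...] from the normal equations X^T X beta = X^T y."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical sample interval generated by y = 1 + 2*a + 3*b.
rows = [(0, 0), (1, 0), (0, 1), (1, 1)]
targets = [1, 3, 4, 6]
beta = fit_least_squares(rows, targets)

# One sample for three unknown coefficients: the system cannot be solved.
try:
    fit_least_squares([(1, 2)], [3])
    underdetermined = False
except ValueError:
    underdetermined = True
```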
When the data of the sample interval is insufficient, regression coefficients of an approximate solution can be obtained using a method called L1 regularization (Non-Patent Document 1), as illustrated in equation (2). L1 regularization has the property of driving regression coefficients that contribute little to the prediction toward zero.

[Equation (2): formula image JPOXMLDOC01-appb-M000001, not reproduced in this text]

Here, $y_i$ is the objective variable, $x_{ij}$ are the explanatory variables, and $t$ is the tuning parameter.
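For orientation, the standard constrained form of L1-regularized regression (the lasso) is written below using the symbols defined above. The formula image for equation (2) is not available in this text, so this is the textbook form and is only assumed to correspond to it; the sample count $N$ and the summation ranges are likewise assumptions.

```latex
\hat{\beta} = \arg\min_{\beta}
\sum_{i=1}^{N} \Bigl( y_i - \beta_0 - \sum_{j} \beta_j x_{ij} \Bigr)^2
\quad \text{subject to} \quad \sum_{j} \lvert \beta_j \rvert \le t
```

A smaller $t$ tightens the constraint and forces more coefficients to exactly zero, which is the property the text relies on.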
FIG. 2 shows a data prediction device 10 of the related art, configured using L1 regularization. The data collection unit 11 collects data from each of a large number of sensors and transmits the same data to the prediction unit 12 and the learning section selection unit 13. The learning section selection unit 13 uses rules to select data within a data section in which the event to be predicted is likely to appear, and transmits the selected data to the regression coefficient calculation unit 14 as the data to be learned. The regression coefficient calculation unit 14 obtains regression coefficients using L1 regularization and transmits them to the prediction unit 12. The prediction unit 12 continuously calculates predicted values by substituting the regression coefficients received from the regression coefficient calculation unit 14 and the data received from the data collection unit 11 into the linear regression equation.
L1 regularization makes it possible to obtain regression coefficients even when many explanatory variables are used. However, the range of sensors and the range of measurement times that affect the prediction are not known in advance, so both ranges have to be set generously. As a result, the number of data items used as input becomes very large, which strains the line capacity for collecting and transmitting data and makes the amount of computation in the learning process enormous.
An object of the present disclosure is to reduce the number of data items used when obtaining regression coefficients, reduce the line capacity for collecting and transmitting data, and reduce the amount of computation in the learning process.
A data prediction device according to a first aspect of the present disclosure includes: a data collection unit that transmits a plurality of collected data, and selection data selected from the plurality of data on the basis of received selection information; a first regression coefficient calculation unit that calculates regression coefficients from the plurality of data received from the data collection unit using a first calculation method, which obtains regression coefficients by L1 regularization, and transmits to the data collection unit selection information for selecting the data corresponding to those calculated regression coefficients whose absolute value is equal to or greater than a threshold; a determination unit that determines, when regression coefficients are to be calculated, which of the first calculation method and a second calculation method different from the first calculation method is to be used; a second regression coefficient calculation unit that calculates regression coefficients using the calculation method determined by the determination unit; and a prediction unit that uses the plurality of data received from the data collection unit when the calculation method determined by the determination unit is the first calculation method, uses the selection data received from the data collection unit when the determined calculation method is the second calculation method, and outputs a prediction result predicted on the basis of the data used and the regression coefficients calculated by the second regression coefficient calculation unit.
A second aspect of the present disclosure is the data prediction device of the first aspect, further including an error monitoring unit that monitors the number of occurrences of errors between the prediction result and the measured value that are equal to or greater than a predetermined value, wherein the determination unit, in a state in which it has determined that the second calculation method is to be used, determines that the first calculation method is to be used when the number of occurrences of the error monitored by the error monitoring unit reaches or exceeds a predetermined number.
A third aspect of the present disclosure is the data prediction device of the first aspect, further including an event determination unit that determines whether a preset event has occurred, wherein the determination unit, in a state in which it has determined that the second calculation method is to be used, determines that the first calculation method is to be used when the event determination unit determines that the event has occurred.
A fourth aspect of the present disclosure is the data prediction device of any one of the first to third aspects, further including an error monitoring unit that monitors the number of occurrences of errors between the prediction result and the measured value that are equal to or greater than a predetermined value, wherein, when the number of occurrences of the error monitored by the error monitoring unit reaches or exceeds a predetermined number, the determination unit instructs the first regression coefficient calculation unit to recalculate the regression coefficients by the first calculation method and to retransmit the selection information to the data collection unit; the first regression coefficient calculation unit recalculates the regression coefficients and retransmits the selection information in response to the instruction from the determination unit; and the data collection unit retransmits the selection data to the first regression coefficient calculation unit in accordance with the selection information retransmitted in response to the instruction from the determination unit.
A fifth aspect of the present disclosure is a program that causes a computer to execute data prediction processing including: transmitting a plurality of collected data, and selection data selected from the plurality of data on the basis of received selection information; calculating regression coefficients from the received plurality of data using a first calculation method, which obtains regression coefficients by L1 regularization, and transmitting selection information for selecting the data corresponding to those calculated regression coefficients whose absolute value is equal to or greater than a threshold; determining, when regression coefficients are to be calculated, which of the first calculation method and a second calculation method different from the first calculation method is to be used; calculating regression coefficients using the determined calculation method; and using the received plurality of data when the determined calculation method is the first calculation method, using the received selection data when the determined calculation method is the second calculation method, and outputting a prediction result predicted on the basis of the data used and the calculated regression coefficients.
A sixth aspect of the present disclosure is a data prediction method in which a computer: transmits a plurality of collected data, and selection data selected from the plurality of data on the basis of received selection information; calculates regression coefficients from the received plurality of data using a first calculation method, which obtains regression coefficients by L1 regularization, and transmits selection information for selecting the data corresponding to those calculated regression coefficients whose absolute value is equal to or greater than a threshold; determines, when regression coefficients are to be calculated, which of the first calculation method and a second calculation method different from the first calculation method is to be used; calculates regression coefficients using the determined calculation method; and uses the received plurality of data when the determined calculation method is the first calculation method, uses the received selection data when the determined calculation method is the second calculation method, and outputs a prediction result predicted on the basis of the data used and the calculated regression coefficients.
According to the present disclosure, it is possible to reduce the number of data items, reduce the line capacity for collecting and transmitting data, and reduce the amount of computation in the learning process.
FIG. 1 is a schematic diagram explaining the prediction of data by a linear regression equation.
FIG. 2 is a block diagram illustrating a data prediction device of the related art.
FIG. 3 is a block diagram illustrating a basic example of the data prediction device of the present embodiment.
FIG. 4 is a block diagram illustrating an extended example of the data prediction device of the present embodiment.
FIG. 5 is a block diagram illustrating the hardware configuration of the extended example of the data prediction device of the present embodiment.
FIG. 6 is a schematic diagram illustrating the flow of processing in the learning phase of the extended example of the present embodiment.
FIG. 7 is a schematic diagram illustrating the flow of processing in the prediction phase of the extended example of the present embodiment.
FIG. 8 is a schematic diagram illustrating the flow of processing in the recalculation and reselection phase of the extended example of the present embodiment.
FIG. 9 is a block diagram illustrating the data prediction device of the present embodiment.
FIG. 3 illustrates the data prediction device 20 of the present embodiment. The data collection unit 21 collects data from each of a large number of sensors $S_1, S_2, S_3, \ldots$ and transmits the same data to the prediction unit 22 and the learning section selection unit 23. The data that the data collection unit 21 transmits to the prediction unit 22 and the learning section selection unit 23 consists of the plurality of data collected from the sensors $S_1, S_2, S_3, \ldots$, and the selection data selected from the collected plurality of data on the basis of selection information received from the first regression coefficient calculation unit 25, described later.
The learning section selection unit 23 uses rules to select a data section in which the event to be predicted is expected to appear, and transmits the data to be learned to the first regression coefficient calculation unit 25 and the second regression coefficient calculation unit 24. That is, the learning section selection unit 23 removes unnecessary data and data with little variation lying in sections in which the event to be predicted is not expected to appear.
The first regression coefficient calculation unit 25 calculates regression coefficients from the plurality of data transmitted from the data collection unit 21 and selected by the learning section selection unit 23, using the first calculation method, which obtains the regression coefficients of an approximate solution by L1 regularization. It then selects, from the calculated regression coefficients, those whose absolute value is equal to or greater than a threshold, and transmits information indicating the selected regression coefficients to each of the data collection unit 21 and the second regression coefficient calculation unit 24.
On the basis of the information received from the first regression coefficient calculation unit 25, the data collection unit 21 selects, from the collected plurality of data, the data corresponding to the regression coefficients whose absolute value is equal to or greater than the threshold, and transmits the selected data as selection data to each of the prediction unit 22, the learning section selection unit 23, and the error monitoring unit 26. Because the information transmitted from the first regression coefficient calculation unit 25 to the data collection unit 21 is used by the data collection unit 21 to select data, it is referred to below as selection information.
In the data collection unit 21, the data corresponding to regression coefficients whose absolute value is equal to or greater than the threshold is selected as selection data, and the data corresponding to regression coefficients whose absolute value is less than the threshold is deleted. Therefore, the selection information transmitted to the data collection unit 21 may be information indicating the regression coefficients whose absolute value is less than the threshold, instead of information indicating the regression coefficients whose absolute value is equal to or greater than the threshold; the data collection unit 21 then obtains the selection data by deleting the data corresponding to the regression coefficients whose absolute value is less than the threshold.
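The selection step can be sketched in Python as below. The example fits L1-regularized coefficients by coordinate descent on the penalized objective (1/2n)·Σ residual² + λ·Σ|β_j|, which is a common practical formulation and an assumption here (the disclosure states only that L1 regularization is used), and then derives both variants of the selection information. All names, the toy data, `lam`, and `threshold` are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_fit(rows, targets, lam, sweeps=100):
    """Coordinate descent for (1/2n) * sum(residual^2) + lam * sum(|beta_j|)."""
    n, k = len(rows), len(rows[0])
    intercept, beta = 0.0, [0.0] * k
    for _ in range(sweeps):
        intercept = sum(
            y - sum(b * x for b, x in zip(beta, row))
            for row, y in zip(rows, targets)
        ) / n
        for j in range(k):
            rho = 0.0
            norm = 0.0
            for row, y in zip(rows, targets):
                others = sum(
                    b * x for i, (b, x) in enumerate(zip(beta, row)) if i != j
                )
                rho += row[j] * (y - intercept - others)
                norm += row[j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm else 0.0
    return intercept, beta


def selection_info(beta, threshold):
    """Split coefficient indices into those to keep and those to delete."""
    keep = [j for j, b in enumerate(beta) if abs(b) >= threshold]
    drop = [j for j, b in enumerate(beta) if abs(b) < threshold]
    return keep, drop


# Toy data: the target depends only on column 0; column 1 is unrelated.
rows = [(-2, 2), (-1, -1), (0, -2), (1, -1), (2, 2)]
targets = [-5, -2, 1, 4, 7]  # y = 1 + 3 * column 0
intercept, beta = lasso_fit(rows, targets, lam=0.5)
keep, drop = selection_info(beta, threshold=0.1)
```

Either `keep` or `drop` identifies the same selection data, which mirrors the two alternatives described in the text.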
The second regression coefficient calculation unit 24 obtains regression coefficients using the selection data transmitted from the data collection unit 21 and passed through the learning section selection unit 23, using L2 regularization or multiple regression rather than L1 regularization.
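For comparison with L1 regularization, the textbook penalized form of L2 regularization (ridge regression) is shown below with the same symbols $y_i$ and $x_{ij}$. The disclosure does not give this formula; the regularization weight $\lambda$, the sample count $N$, and the convention of leaving $\beta_0$ unpenalized are assumptions.

```latex
\hat{\beta} = \arg\min_{\beta} \left\{
\sum_{i=1}^{N} \Bigl( y_i - \beta_0 - \sum_{j} \beta_j x_{ij} \Bigr)^2
+ \lambda \sum_{j} \beta_j^2
\right\}
```

Setting $\lambda = 0$ reduces this to ordinary multiple regression. Unlike the L1 penalty, the squared penalty shrinks coefficients without setting them exactly to zero, so it does not by itself select data.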
The regression coefficients calculated by the second regression coefficient calculation unit 24 are transmitted to the prediction unit 22, where prediction is performed by regression calculation using L2 regularization or multiple regression. The data collection unit 21 transmits the selection data to the prediction unit 22 and the learning section selection unit 23. That is, the data collection unit 21 collects data from every one of the sensors $S_1, S_2, S_3, \ldots$, but transmits data selectively.
The prediction unit 22 outputs a prediction result predicted with the linear regression equation, using the selection data received from the data collection unit 21 and the regression coefficients received from the second regression coefficient calculation unit 24.
That is, the data corresponding to regression coefficients whose absolute value is less than the threshold is deleted, and the prediction result is calculated from the regression coefficients whose absolute value is equal to or greater than the threshold and the data corresponding to those coefficients. The amount of data used in the learning process is therefore reduced, and the amount of computation can be reduced.
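A small Python sketch of the selective transmission and the resulting prediction follows. It assumes the selection information is a list of column indices and that the second-stage coefficients are ordered to match it; the function names and all values are hypothetical.

```python
def select_data(sample, selection_info):
    """Keep only the readings whose indices appear in the selection information."""
    return [sample[i] for i in selection_info]


def predict(intercept, coefficients, selected_sample):
    """Linear regression over the selected readings only."""
    if len(coefficients) != len(selected_sample):
        raise ValueError("one coefficient is required per selected reading")
    return intercept + sum(c * v for c, v in zip(coefficients, selected_sample))


sample = [2.5, 3.5, 4.5, 5.5]  # hypothetical readings from four sensors
selection_info = [0, 2]        # hypothetical indices kept after thresholding
selected = select_data(sample, selection_info)
result = predict(1.0, [2.0, -1.0], selected)
```

Only two of the four readings are transmitted and multiplied here, which is the source of the savings in line capacity and computation.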
The error monitoring unit 26 monitors the error, that is, the difference between the prediction result from the prediction unit 22 and the measured value, which is data collected by the data collection unit 21. When errors equal to or greater than a predetermined value have occurred a predetermined number of times (the allowable count) or more, it transmits a reselection instruction to the data collection unit 21 and the first regression coefficient calculation unit 25. In response to this instruction, the data collection unit 21 transmits all data, rather than the selection data, to the subsequent stages for a certain period until reselection of the selection data is complete, and the first regression coefficient calculation unit 25 again selects data by L1 regularization over all the data. Here, the selection data is reselected only when the error of the prediction result obtained by L2 regularization or multiple regression is larger than the error of the prediction result obtained by L1 regularization; when it is equal to or smaller than that error, no reselection is performed.
Alternatively, the error monitoring unit 26 may monitor the error between the prediction result from the prediction unit 22 and the measured value, and the second regression coefficient calculation unit 24 may recalculate the regression coefficients when errors equal to or greater than the predetermined value have occurred the predetermined number of times (the allowable count) or more. If, after the regression coefficients have been recalculated, errors equal to or greater than the predetermined value again occur the predetermined number of times (the allowable count) or more, a reselection instruction may be transmitted to the data collection unit 21 and the first regression coefficient calculation unit 25.
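The counting behaviour of the error monitoring unit might look like the following Python sketch. The class name, the parameter names, and the choice to reset the counter after an instruction is issued are assumptions; the disclosure does not say whether or when the count is cleared.

```python
class ErrorMonitor:
    """Counts large prediction errors and signals when reselection is needed."""

    def __init__(self, error_limit, allowed_count):
        self.error_limit = error_limit      # the "predetermined value"
        self.allowed_count = allowed_count  # the "allowable count"
        self.large_error_count = 0

    def observe(self, predicted, measured):
        """Return True when a reselection instruction should be issued."""
        if abs(predicted - measured) >= self.error_limit:
            self.large_error_count += 1
        if self.large_error_count >= self.allowed_count:
            # Assumption: start counting afresh once an instruction is sent.
            self.large_error_count = 0
            return True
        return False


monitor = ErrorMonitor(error_limit=1.0, allowed_count=2)
pairs = [(10.0, 10.2), (10.0, 11.5), (10.0, 8.0), (10.0, 10.1)]
signals = [monitor.observe(predicted, measured) for predicted, measured in pairs]
```

In this made-up run the second large error, at the third observation, triggers the instruction.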
FIG. 4 illustrates a data prediction device 60, which is an extended example of the data prediction device 20 of the present embodiment. Description of configurations and operations that are the same as those of the data prediction device 20 is omitted as appropriate. The data prediction device 60 of FIG. 4 differs from the data prediction device 20 of FIG. 3 in that it has a determination unit 67 that selects the regression coefficient calculation method used to calculate the regression coefficients according to, for example, trends in the data.
The data collection unit 61 collects data from a large number of sensors $S_1, S_2, S_3, \ldots$ and transmits the same data to the prediction unit 62 and the learning section selection unit 63. The learning section selection unit 63 uses rules to select a data section in which the event to be predicted is likely to appear, and transmits the data to be learned to the regression coefficient calculation unit 64.
In the data prediction device 20, the second regression coefficient calculation unit 24 uses, for example, L2 regularization or multiple regression to calculate the regression coefficients. In the data prediction device 60, by contrast, the second regression coefficient calculation unit 64 selects the regression coefficient calculation method to use according to the determination result of the determination unit 67. The method is selected from, for example, the following two:
1) Multiple regression or L2 regularization using the selection data (second calculation method)
2) L1 regularization using all the data (first calculation method)
Until the error monitoring unit 66 detects that prediction errors equal to or greater than a predetermined value have occurred a predetermined number of times (the allowable count) or more, the determination unit 67 determines that regression coefficient calculation method 1) is to be selected, and the second regression coefficient calculation unit 64 calculates the regression coefficients by method 1) using the selection data. When the error monitoring unit 66 detects that prediction errors equal to or greater than the predetermined value have occurred the predetermined number of times (the allowable count) or more, the determination unit 67 determines that the regression coefficient calculation method is to be switched from 1) to 2).
On the basis of the determination by the determination unit 67, the second regression coefficient calculation unit 64 calculates the regression coefficients by regression coefficient calculation method 2) using all the data. In this way, a regression coefficient calculation method with smaller error can be selected according to the data.
Note that the learning section selection unit 63 may further include an event determination unit that uses rules to determine whether a preset event (for example, rainfall) has occurred, and the determination unit 67 may also determine that regression coefficient calculation method 2) is to be selected when the event determination unit determines that the data to be learned has been updated. In this way, a regression coefficient calculation method with smaller error can be selected even when the data to be learned is updated.
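The decision made by the determination unit 67 can be summarized in a short Python sketch. The function name, the string labels, and the idea of passing the error count and event flag as plain arguments are illustrative assumptions, not the disclosed implementation.

```python
FIRST_METHOD = "L1 regularization using all the data"
SECOND_METHOD = "multiple regression or L2 regularization using the selection data"


def choose_method(large_error_count, allowed_count, event_detected=False):
    """Pick the regression coefficient calculation method.

    The second method is the default; the first method is chosen once large
    prediction errors reach the allowable count or a preset event is detected.
    """
    if event_detected or large_error_count >= allowed_count:
        return FIRST_METHOD
    return SECOND_METHOD


normal = choose_method(large_error_count=1, allowed_count=3)
degraded = choose_method(large_error_count=3, allowed_count=3)
after_event = choose_method(large_error_count=0, allowed_count=3, event_detected=True)
```

Keeping the cheaper second method as the default reflects the stated goal of reducing computation, with the full L1 recalculation reserved for when accuracy degrades.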
FIG. 5 illustrates the hardware configuration of the data prediction device 60. As an example, the data prediction device 60 includes a CPU (Central Processing Unit) 51, a primary storage unit 52, a secondary storage unit 53, and an external interface 54, as shown in FIG. 5. The CPU 51 is an example of a hardware processor. The CPU 51, the primary storage unit 52, the secondary storage unit 53, and the external interface 54 are connected to one another via a bus 59.
The primary storage unit 52 is, for example, a volatile memory such as a RAM (Random Access Memory). The secondary storage unit 53 is, for example, a non-volatile memory such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive).
The secondary storage unit 53 includes a program storage area 53A and a data storage area 53B. As an example, the program storage area 53A stores programs such as the data prediction program. As an example, the data storage area 53B stores data from the sensors, intermediate data produced during data prediction processing, and the like.
The CPU 51 reads the data prediction program from the program storage area 53A and loads it into the primary storage unit 52. By loading and executing the data prediction program, the CPU 51 operates as the data collection unit 61, the prediction unit 62, the learning section selection unit 63, the second regression coefficient calculation unit 64, the first regression coefficient calculation unit 65, the error monitoring unit 66, and the determination unit 67 of FIG. 4.
Note that programs such as the data prediction program may be stored on an external server and loaded into the primary storage unit 52 via a network. Alternatively, programs such as the data prediction program may be stored on a non-transitory recording medium such as a Digital Versatile Disc (DVD) and loaded into the primary storage unit 52 via a recording medium reading device.
External devices are connected to the external interface 54, which handles the transmission and reception of various information between the external devices and the CPU 51. FIG. 5 shows an example in which a sensor 31A and a danger notification system 31B are connected to the external interface 54. The sensor 31A includes a large number of sensors.
 データ予測装置60で予測されたデータは、例えば、外部インタフェース54に接続される危険報知システム31Bに送信され、危険報知システム31Bにおける危険報知処理に使用されてもよい。また、データ予測装置10で予測されたデータは、例えば、外部インタフェース54に接続される外部記憶装置に記録されてもよいし、外部インタフェース54に接続されるディスプレイの画面に文字または画像として表示されてもよい。 The data predicted by the data prediction device 60 may be transmitted to, for example, the danger notification system 31B connected to the external interface 54 and used for the danger notification processing in the danger notification system 31B. Further, the data predicted by the data prediction device 10 may be recorded in an external storage device connected to the external interface 54, for example, or displayed as characters or images on the screen of the display connected to the external interface 54. You may.
 また、データ予測装置60は、専用装置であってもよいし、ワークステーション、パーソナルコンピュータ、またはタブレットなどの汎用装置であってもよい。 Further, the data prediction device 60 may be a dedicated device or a general-purpose device such as a workstation, a personal computer, or a tablet.
 FIGS. 6 to 8 illustrate the data prediction processing in the data prediction device 60. FIG. 6 illustrates the flow of the learning phase.
 In step S201, the data collection unit 61 collects data from, for example, the sensor 31A, which comprises a large number of sensors. In step S202, the data collection unit 61 transmits the collected data to the learning section selection unit 63.
 In step S203, the learning section selection unit 63 selects a data section that appears suitable for prediction and transmits it to the second regression coefficient calculation unit 64 and the first regression coefficient calculation unit 65. The first regression coefficient calculation unit 65 calculates regression coefficients using L1 regularization, selects the regression coefficients whose absolute values are equal to or greater than a threshold, and transmits the selection to the second regression coefficient calculation unit 64 and the data collection unit 61.
 In step S205, the second regression coefficient calculation unit 64 uses the data selected by the first regression coefficient calculation unit 65 to obtain regression coefficients by an ordinary regression method, for example L2 regularization or multiple regression, and transmits them to the prediction unit 62.
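 The two-stage learning phase described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the solver (cyclic coordinate descent), the absence of an intercept term, the function names, and the values of `l1`, `l2`, and `threshold` are all hypothetical choices made for the example. The first stage fits with an L1 penalty and keeps the inputs whose coefficients have an absolute value at or above the threshold; the second stage refits on those inputs only, here with an optional L2 penalty.

```python
from itertools import product


def soft_threshold(value, limit):
    """Shrink value toward zero by limit (the proximal step for an L1 penalty)."""
    if value > limit:
        return value - limit
    if value < -limit:
        return value + limit
    return 0.0


def fit_coordinate_descent(rows, targets, l1=0.0, l2=0.0, sweeps=50):
    """Minimise (1/(2n))*sum(residual^2) + l1*sum|w| + (l2/2)*sum(w^2), no intercept."""
    n, p = len(rows), len(rows[0])
    weights = [0.0] * p
    residual = list(targets)  # residual = y - Xw, and w starts at zero
    for _ in range(sweeps):
        for j in range(p):
            col = [row[j] for row in rows]
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue  # a constant-zero input carries no information
            # Correlation of input j with the residual that excludes input j.
            rho = sum(c * (r + c * weights[j]) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, l1) / (norm + l2)
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                weights[j] = new_w
    return weights


def select_inputs(rows, targets, l1, threshold):
    """Stage 1 (assumed role of unit 65): L1 fit, keep inputs with |coefficient| >= threshold."""
    weights = fit_coordinate_descent(rows, targets, l1=l1)
    return [j for j, w in enumerate(weights) if abs(w) >= threshold]


def refit_on_selected(rows, targets, selected, l2=0.0):
    """Stage 2 (assumed role of unit 64): ordinary or L2-regularized fit on selected inputs."""
    reduced = [[row[j] for j in selected] for row in rows]
    return fit_coordinate_descent(reduced, targets, l2=l2)


# Toy data: four inputs taking every +/-1 combination; only inputs 0 and 2 matter.
rows = [list(combo) for combo in product([-1.0, 1.0], repeat=4)]
targets = [2.0 * r[0] - 3.0 * r[2] for r in rows]

selected = select_inputs(rows, targets, l1=0.1, threshold=0.5)
coefficients = refit_on_selected(rows, targets, selected)
print(selected, coefficients)
```

 In this toy case the L1 stage keeps inputs 0 and 2, and the refit recovers coefficients close to 2 and -3 because the second stage is not shrunk by the L1 penalty. With real sensor data, the threshold and penalty strengths would need tuning.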
 FIG. 7 illustrates the flow of the prediction phase. In step S206, the data collection unit 61 transmits the data selected in step S204 to the prediction unit 62 and the error monitoring unit 66. In step S207, the prediction unit 62 calculates a prediction result based on the data and the regression coefficients obtained in step S205, transmits it to the error monitoring unit 66, and stores it in, for example, an external storage device.
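 The prediction phase can be illustrated with a short sketch. It assumes a plain linear combination without an intercept term, and the function names and sample values are hypothetical; the point is only that the data collection side forwards the selected inputs and the prediction side multiplies them by the learned coefficients.

```python
def forward_selected(all_values, selected):
    """Keep only the inputs named by the selection information (assumed to be indices)."""
    return [all_values[j] for j in selected]


def predict(coefficients, selected_values):
    """Linear prediction: sum of coefficient * input over the selected inputs."""
    if len(coefficients) != len(selected_values):
        raise ValueError("one coefficient is required per selected input")
    return sum(a * d for a, d in zip(coefficients, selected_values))


# Hypothetical values: four collected inputs, of which inputs 0 and 2 were selected.
all_values = [10.0, 99.0, 4.0, 7.0]
selected = [0, 2]
coefficients = [2.0, -3.0]

sent = forward_selected(all_values, selected)
prediction = predict(coefficients, sent)
print(sent, prediction)  # [10.0, 4.0] 8.0
```

 Only two of the four values cross the link between collection and prediction in this example, which is the kind of reduction in transmitted data that the selection step is meant to provide.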
 FIG. 8 illustrates the flow of the error monitoring, recalculation, and reselection phases. In step S208, the error monitoring unit 66 calculates the prediction error between the prediction result and the data (measured value). When a prediction error equal to or greater than a predetermined value has occurred a predetermined number of times (the allowable number) or more, the error monitoring unit 66 instructs the data collection unit 61 and the determination unit 67 to switch the regression coefficient calculation method.
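 The counting rule used by the error monitoring unit can be sketched as below. The class name, the use of the absolute difference as the prediction error, and the choice to reset the counter after raising a signal are assumptions made for illustration; the text only states that a signal is raised once errors at or above a predetermined value have occurred a predetermined number of times.

```python
class ErrorMonitor:
    """Counts large prediction errors and signals when the allowed count is reached."""

    def __init__(self, error_limit, allowed_count):
        self.error_limit = error_limit      # the "predetermined value"
        self.allowed_count = allowed_count  # the "predetermined (allowable) number"
        self.count = 0

    def observe(self, predicted, measured):
        """Record one prediction/measurement pair; return True when a signal is due."""
        if abs(predicted - measured) >= self.error_limit:
            self.count += 1
        if self.count >= self.allowed_count:
            self.count = 0  # assumption: start counting afresh after signalling
            return True
        return False


monitor = ErrorMonitor(error_limit=1.0, allowed_count=3)
pairs = [(5.0, 5.2), (5.0, 6.5), (5.0, 3.0), (5.0, 5.1), (5.0, 7.0)]
signals = [monitor.observe(p, m) for p, m in pairs]
print(signals)  # [False, False, False, False, True]
```

 The errors need not be consecutive in this sketch; whether they must be is not specified here, so a deployment would have to settle that point.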
 When switching of the regression coefficient calculation method is instructed, the data collection unit 61 transmits all the data to the learning section selection unit 63. The determination unit 67 determines that the second regression coefficient calculation unit 64 is to calculate the regression coefficients by L1 regularization using all the data, and in step S209 transmits this determination result to the second regression coefficient calculation unit 64. On receiving the determination from the determination unit 67, the second regression coefficient calculation unit 64 switches its regression coefficient calculation method.
 In step S210, the error monitoring unit 66 calculates the prediction error between the prediction result and the data (measured value). When a prediction error equal to or greater than a predetermined value has occurred a predetermined number of times (the allowable number) or more, the error monitoring unit 66 instructs the data collection unit 61 and the determination unit 67 to perform reselection.
 When reselection is instructed, the data collection unit 61 transmits all the data to the learning section selection unit 63. The determination unit 67 determines that reselection is to be performed, and in step S211 transmits this determination result to the first regression coefficient calculation unit 65. On receiving the determination from the determination unit 67, the first regression coefficient calculation unit 65 reselects the regression coefficients.
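 The two-step escalation in steps S208 to S211 can be summarised as a small decision function. This is a sketch only: the enum, the function name, and the returned tuple are hypothetical, and the state that follows a reselection is not specified in the text, so the sketch simply leaves the method unchanged in that case.

```python
from enum import Enum


class Method(Enum):
    SECOND_ON_SELECTED = "L2 regularization or multiple regression on selected data"
    FIRST_ON_ALL = "L1 regularization on all data"


def on_error_signal(current):
    """Return (next_method, reselect) when the error monitor raises a signal."""
    if current is Method.SECOND_ON_SELECTED:
        # Steps S208/S209: switch the calculation method, no reselection yet.
        return Method.FIRST_ON_ALL, False
    # Steps S210/S211: already using L1 on all data, so request reselection.
    # Assumption: the method itself is left unchanged here.
    return Method.FIRST_ON_ALL, True


first = on_error_signal(Method.SECOND_ON_SELECTED)
second = on_error_signal(first[0])
print(first, second)
```

 Keeping the decision in one pure function makes the escalation order easy to test separately from the regression code.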
 According to this embodiment, the prediction error may tend to be larger than when only regression coefficients calculated by L1 regularization are used for prediction. However, by reselecting the data when the prediction error is large, or periodically, the prediction error can be kept within a certain range.
 An example has been described in which the regression coefficient calculation method is switched from L2 regularization or multiple regression using the data selected by the first regression coefficient calculation unit 65 to L1 regularization using all the data, but this embodiment is not limited to this. The regression coefficient calculation method may instead be switched from L1 regularization using all the data to L2 regularization or multiple regression using the data selected by the first regression coefficient calculation unit 65.
 As illustrated in FIG. 9, the data prediction device 60 may include a data collection device 32A, a learning device 32B, and a prediction device 32C. The data collection device 32A includes the data collection unit 61; the learning device 32B includes the learning section selection unit 63, the second regression coefficient calculation unit 64, the first regression coefficient calculation unit 65, the error monitoring unit 66, and the determination unit 67; and the prediction device 32C includes the prediction unit 62.
 Transmission lines connect the sensor 31A to the data collection unit 61, and the prediction unit 62 to the output destination of the prediction result, for example the danger notification system 31B. In the example shown in FIG. 9, transmission lines also connect the data collection unit 61 to the learning section selection unit 63, and the second regression coefficient calculation unit 64 to the prediction unit 62.
 The data prediction device of the present disclosure includes: a data collection unit that transmits a plurality of collected data, and selection data selected from the plurality of data based on received selection information; a first regression coefficient calculation unit that, using a first calculation method that obtains regression coefficients by L1 regularization, calculates regression coefficients based on the plurality of data received from the data collection unit, and transmits to the data collection unit selection information for selecting the data corresponding to those calculated regression coefficients whose absolute values are equal to or greater than a threshold; a determination unit that determines which of the first calculation method and a second calculation method different from the first calculation method is to be used when calculating regression coefficients; a second regression coefficient calculation unit that calculates regression coefficients using the calculation method determined by the determination unit; and a prediction unit that uses the plurality of data received from the data collection unit when the calculation method determined by the determination unit is the first calculation method, uses the selection data received from the data collection unit when the determined calculation method is the second calculation method, and outputs a prediction result predicted based on the data used and the regression coefficients calculated by the second regression coefficient calculation unit.
 The present disclosure thereby makes it possible to reduce the number of data items, reduce the line capacity needed to collect and transmit the data, and reduce the amount of computation in the learning process. Instead of reducing the number of data items transmitted from the data collection unit 61 to the subsequent stages, the number of sensors from which the data collection unit 61 receives data may be reduced.
 In the present disclosure, the calculation by the second regression coefficient calculation unit 64 may use an ordinary regression coefficient calculation method, which reduces the calculation cost and enables processing in environments with limited computing resources.
61 Data collection unit
62 Prediction unit
66 Error monitoring unit
64 Second regression coefficient calculation unit
65 First regression coefficient calculation unit
31A Sensor
51 CPU
52 Primary storage unit
53 Secondary storage unit

Claims (6)

  1.  A data prediction device comprising:
     a data collection unit that transmits a plurality of collected data, and selection data selected from the plurality of data based on received selection information;
     a first regression coefficient calculation unit that, using a first calculation method that obtains regression coefficients by L1 regularization, calculates regression coefficients based on the plurality of data received from the data collection unit, and transmits to the data collection unit selection information for selecting the data corresponding to those calculated regression coefficients whose absolute values are equal to or greater than a threshold;
     a determination unit that determines which of the first calculation method and a second calculation method different from the first calculation method is to be used when calculating regression coefficients;
     a second regression coefficient calculation unit that calculates regression coefficients using the calculation method determined by the determination unit; and
     a prediction unit that uses the plurality of data received from the data collection unit when the calculation method determined by the determination unit is the first calculation method, uses the selection data received from the data collection unit when the calculation method determined by the determination unit is the second calculation method, and outputs a prediction result predicted based on the data used and the regression coefficients calculated by the second regression coefficient calculation unit.
  2.  The data prediction device according to claim 1, further comprising an error monitoring unit that monitors the number of occurrences of an error, between the prediction result and a measured value, that is equal to or greater than a predetermined value,
     wherein the determination unit, in a state in which it has determined that the second calculation method is to be used, determines that the first calculation method is to be used when the number of occurrences of the error monitored by the error monitoring unit has reached a predetermined number or more.
  3.  The data prediction device according to claim 1, further comprising an event determination unit that determines whether or not a preset event has occurred,
     wherein the determination unit, in a state in which it has determined that the second calculation method is to be used, determines that the first calculation method is to be used when the event determination unit determines that the event has occurred.
  4.  The data prediction device according to any one of claims 1 to 3, further comprising an error monitoring unit that monitors the number of occurrences of an error, between the prediction result and a measured value, that is equal to or greater than a predetermined value,
     wherein the determination unit, when the number of occurrences of the error monitored by the error monitoring unit has reached a predetermined number or more, instructs the first regression coefficient calculation unit to recalculate the regression coefficients by the first calculation method and to retransmit the selection information to the data collection unit,
     the first regression coefficient calculation unit recalculates the regression coefficients and retransmits the selection information in response to the instruction from the determination unit, and
     the data collection unit retransmits the selection data to the first regression coefficient calculation unit in accordance with the selection information retransmitted in response to the instruction from the determination unit.
  5.  A program for causing a computer to execute data prediction processing comprising:
     transmitting a plurality of collected data, and selection data selected from the plurality of data based on received selection information;
     calculating regression coefficients based on the received plurality of data using a first calculation method that obtains regression coefficients by L1 regularization, and transmitting selection information for selecting the data corresponding to those calculated regression coefficients whose absolute values are equal to or greater than a threshold;
     determining which of the first calculation method and a second calculation method different from the first calculation method is to be used when calculating regression coefficients;
     calculating regression coefficients using the determined calculation method; and
     using the received plurality of data when the determined calculation method is the first calculation method, using the received selection data when the determined calculation method is the second calculation method, and outputting a prediction result predicted based on the data used and the calculated regression coefficients.
  6.  A data prediction method in which a computer:
     transmits a plurality of collected data, and selection data selected from the plurality of data based on received selection information;
     calculates regression coefficients based on the received plurality of data using a first calculation method that obtains regression coefficients by L1 regularization, and transmits selection information for selecting the data corresponding to those calculated regression coefficients whose absolute values are equal to or greater than a threshold;
     determines which of the first calculation method and a second calculation method different from the first calculation method is to be used when calculating regression coefficients;
     calculates regression coefficients using the determined calculation method; and
     uses the received plurality of data when the determined calculation method is the first calculation method, uses the received selection data when the determined calculation method is the second calculation method, and outputs a prediction result predicted based on the data used and the calculated regression coefficients.
PCT/JP2020/010303 2019-03-13 2020-03-10 Data prediction device, data prediction method, and data prediction program WO2020184560A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-045555 2019-03-13
JP2019045555A JP2020149281A (en) 2019-03-13 2019-03-13 Data prediction device, data prediction method, and data prediction program

Publications (1)

Publication Number Publication Date
WO2020184560A1 true WO2020184560A1 (en) 2020-09-17

Family

ID=72426576

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/010303 WO2020184560A1 (en) 2019-03-13 2020-03-10 Data prediction device, data prediction method, and data prediction program

Country Status (2)

Country Link
JP (1) JP2020149281A (en)
WO (1) WO2020184560A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017130041A (en) * 2016-01-20 2017-07-27 富士通株式会社 Information processing device, control method and control program
JP2018010477A (en) * 2016-07-13 2018-01-18 富士通株式会社 Sensor control device, sensor system, sensor control method, and sensor control program
JP2018109876A (en) * 2017-01-04 2018-07-12 株式会社東芝 Sensor design support apparatus, sensor design support method and computer program
JP2018151883A (en) * 2017-03-13 2018-09-27 株式会社東芝 Analysis device, analysis method, and program
JP2019032185A (en) * 2017-08-04 2019-02-28 株式会社東芝 Sensor control support apparatus, sensor control support method, and computer program

Also Published As

Publication number Publication date
JP2020149281A (en) 2020-09-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20770735

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20770735

Country of ref document: EP

Kind code of ref document: A1