WO2024009667A1 - Information processing device, inference model generation method, training data generation method, inference model generation program, and training data generation program - Google Patents

Publication number
WO2024009667A1
WO2024009667A1 PCT/JP2023/020963 JP2023020963W WO2024009667A1 WO 2024009667 A1 WO2024009667 A1 WO 2024009667A1 JP 2023020963 W JP2023020963 W JP 2023020963W WO 2024009667 A1 WO2024009667 A1 WO 2024009667A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
time series
series data
information processing
inference model
Application number
PCT/JP2023/020963
Other languages
French (fr)
Japanese (ja)
Inventor
翔太 林
裕司 白石
Original Assignee
日立造船株式会社
Application filed by 日立造船株式会社
Publication of WO2024009667A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G06F18/15 Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2123/00 Data types
    • G06F2123/02 Data types in the time domain, e.g. time-series data

Definitions

  • the present invention relates to an information processing device, etc. that generates an inference model.
  • Patent Document 1 listed below discloses a neural network model that infers the type of event occurring in a plant from a plurality of plant data acquired in the plant. Note that the above-mentioned "multiple plant data" is time-series data collected at one plant.
  • An object of one aspect of the present invention is to provide an information processing device and the like that can generate a highly versatile inference model.
  • An information processing device includes: a data linking unit that connects a plurality of time series data, each based on data collected for one of a plurality of targets, to generate one pseudo time series data; a teacher data generation unit that performs standardization processing or normalization processing on the pseudo time series data and uses the result as training data; and a learning unit that generates an inference model by machine learning using the training data.
  • Another information processing device includes: a data linking unit that connects a plurality of time-series data, each based on data collected for one of a plurality of objects, to generate one pseudo time series data; and a teacher data generation unit that performs standardization processing or normalization processing on the pseudo time series data to produce teacher data.
  • An inference model generation method is executed by one or more information processing devices and includes: a data linking step of connecting a plurality of time series data, each based on data collected for one of a plurality of targets, to generate one pseudo time series data; a teacher data generation step of performing standardization processing or normalization processing on the pseudo time series data to produce teacher data; and a learning step of generating an inference model by machine learning using the teacher data.
  • A teacher data generation method is executed by one or more information processing devices and includes the data linking step and the teacher data generation step described above.
  • FIG. 1 is a block diagram showing an example of the configuration of main parts of an information processing device according to an embodiment of the present invention.
  • FIG. 2 is a diagram showing the configuration of an information processing system including the information processing device.
  • FIG. 3 is a diagram illustrating an example of determining connection targets.
  • FIG. 4 is a diagram showing an example of generation of time series data.
  • FIG. 5 is a diagram showing an example of generation of pseudo time series data and teacher data.
  • FIG. 6 is a flowchart illustrating an example of a process when the information processing device generates an inference model.
  • FIG. 2 is a diagram showing the configuration of the information processing system 5 according to this embodiment.
  • the information processing system 5 includes information processing devices 1A to 1D and an information processing device 2.
  • the information processing system 5 can generate an inference model using a plurality of time series data from different sources.
  • the inference model generated by the information processing system 5 is more versatile than the inference model generated using time series data from a single source as training data.
  • The information processing devices 1A to 1D are devices that collect time-series data that is the source of training data for generating the above-mentioned inference model, and are located in facilities A to D, respectively. In the following, when there is no need to distinguish between the information processing devices 1A to 1D, they will simply be referred to as "information processing device 1."
  • Facilities A to D may be equipped with at least one piece of equipment.
  • facilities A to D may be plants equipped with multiple pieces of equipment.
  • a “plant” is an industrially used facility, and is equipped with a plurality of devices. The “plant” uses these devices to perform predetermined processing such as production of products or processing of objects.
  • facilities A to D are waste treatment facilities that incinerate waste (for example, combustible garbage) and generate electricity using the waste heat.
  • the information processing device 1A collects time-series data regarding waste incineration and power generation at the facility A.
  • the information processing devices 1B to 1D collect time-series data regarding waste incineration and power generation at the facilities B to D.
  • the time-series data may be the source of the training data for generating the inference model, and may be in accordance with the content of the inference. For example, when generating an inference model that predicts the combustion state of an incinerator, it is sufficient to collect various time-series data related to the combustion state of the incinerator.
  • For example, time-series sensing data related to automatic combustion control (ACC), such as time-series measurements of furnace temperature and time-series measurements of the amount of steam generated in a boiler, may be collected as the time-series data.
  • the information processing device 2 connects the time series data collected by the information processing devices 1A to 1D to generate one time series data. Then, the information processing device 2 performs standardization processing or normalization processing on the connected time series data and uses it as training data. The information processing device 2 generates an inference model by machine learning using the teacher data.
  • the inference model generated in this way is more versatile than when time series data from a single source is used as training data.
  • An inference model generated using training data derived from the time-series data of facilities A to D can be used for inference at any of facilities A to D, and can also be used for inference at facilities other than facilities A to D, provided they are of the same type.
  • the generation of the inference model by the information processing device 2 does not necessarily require data collected at the facility that is the target of inference. Therefore, for example, when a new facility is constructed, the inference model can be used immediately after the facility starts operating. Further, it is not necessarily necessary that all of the information processing devices 1A to 1D collect time-series data that is the source of teacher data.
  • For example, an inference model may be generated using time-series data from facilities A to C as the source of the teacher data, without collecting such time-series data from facility D. This inference model can also be used for inference at facility D.
  • Since the information processing system 5 generates teacher data from time-series data collected at a plurality of facilities, it can collect the required amount of teacher data in a short period of time and quickly generate an inference model.
  • time-series data to be linked is not limited to data collected at multiple different facilities.
  • the information processing device 2 can also connect time-series data collected about a plurality of different pieces of equipment or equipment in one facility.
  • For example, when one facility has two incinerators, the information processing device 2 can connect the time-series data collected at each incinerator and generate an inference model that can be used for inference at either of the two incinerators. Therefore, "facility" in the following description can be replaced with any "object".
  • What is collected as time-series data is also arbitrary.
  • measured values measured by a sensor installed on an arbitrary object itself may be collected as time-series data.
  • Measured values measured by sensors installed around an arbitrary object, or data measured over a wider range than the object, such as temperature and humidity, may be collected as time-series data.
  • setting values for setting the operation of an arbitrary object, command values for causing an arbitrary object to execute a predetermined operation, etc. may be collected as time-series data.
  • FIG. 1 is a block diagram showing an example of the configuration of main parts of information processing apparatuses 1 and 2.
  • the information processing device 1 includes a control section 10 that centrally controls each section of the information processing device 1, and a storage section 11 that stores various data used by the information processing device 1.
  • The information processing device 1 also includes a communication unit 12 for the information processing device 1 to communicate with other devices, an input unit 13 that receives input of various data to the information processing device 1, and an output unit 14 for the information processing device 1 to output various data.
  • the control unit 10 also includes a data acquisition unit 101, a preprocessing unit 102, an inference unit 103, and a control amount determining unit 104.
  • the data acquisition unit 101 acquires time-series data that is the source of training data for generating an inference model.
  • the time series data may be acquired from a sensor or the like placed in the facility via the communication unit 12 or the input unit 13, or may be input by the user of the information processing device 1 via the input unit 13.
  • the data acquisition unit 101 then transmits the acquired data to the information processing device 2 via the communication unit 12.
  • the data acquisition unit 101 acquires an inference model generated using the above data transmitted to the information processing device 2.
  • the inference model may be acquired from the information processing device 2 through communication via the communication unit 12, or may be input by the user of the information processing device 1 via the input unit 13.
  • the data acquisition unit 101 acquires data for inference when performing inference using the acquired inference model.
  • this data will be referred to as inference data.
  • The inference data may be input via the communication unit 12 or the input unit 13 from a sensor or the like placed in the facility to be inferred, or may be input via the input unit 13 by a user of the information processing device 1.
  • the preprocessing unit 102 performs preprocessing on the inference data acquired by the data acquisition unit 101, and generates input data to be input to the above inference model.
  • the above pre-processing is a process of standardizing or normalizing the inference data by applying the same conditions as when the information processing device 2 generates the training data of the inference model.
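  • The idea of applying "the same conditions" at inference time can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes standardization was used, that the mean and standard deviation were stored when the training data was generated, and that the population standard deviation is meant. The names ScalingParams, fit_scaling, and preprocess_inference_data are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class ScalingParams:
    """Statistics computed once on the training-side data (hypothetical container)."""
    mean: float
    std: float


def fit_scaling(training_values):
    # Assumption: population standard deviation; the source only says "standard deviation".
    return ScalingParams(mean=mean(training_values), std=pstdev(training_values))


def preprocess_inference_data(values, params):
    # Reuse the training-time statistics instead of recomputing them on inference data.
    if params.std == 0:
        raise ValueError("standard deviation is zero; cannot standardize")
    return [(v - params.mean) / params.std for v in values]


# Illustrative PV/SV values only, not data from the source.
params = fit_scaling([0.9, 1.0, 1.1, 1.0])
inputs = preprocess_inference_data([1.0, 1.2], params)
```

  • Keeping the statistics in a small immutable object makes it harder to accidentally rescale inference data with statistics of its own, which would break the correspondence with the training data.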
  • The inference unit 103 performs inference using the inference model generated by the information processing device 2. More specifically, the inference unit 103 either uses the output value obtained by inputting the input data generated by the preprocessing unit 102 into the inference model directly as the inference result, or generates the inference result based on that output value.
  • the inference target of the inference model is not particularly limited, and may predict the combustion state in the incinerators provided in the facilities A to D shown in FIG. 2, for example.
  • the combustion state changes, and the amount of steam changes accordingly.
  • the information processing devices 1A to 1D predict the combustion state in the incinerators provided in the facilities A to D by using an inference model that predicts the combustion state.
  • the information processing devices 1A to 1D can appropriately control the equipment in the facilities A to D according to the prediction results, and can stably perform waste incineration and power generation.
  • the control amount determination unit 104 determines the control amount for the equipment installed in the target facility based on the inference result of the inference unit 103.
  • the method for determining the control amount differs depending on what kind of inference model was used to perform the inference. For example, suppose that an inference result using an inference model that predicts the amount of steam generated from the boiler of an incinerator indicates that the amount of steam generated will decrease.
  • the control amount determination unit 104 may determine the control amount of equipment that affects the amount of steam generated from the boiler (for example, the amount of air supplied to the incinerator and the operating speed of the grate).
  • the control amount is determined so that the amount of generated steam increases. For example, the control amount is determined to increase the amount of air supplied into the incinerator or to increase the operating speed of the grate.
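  • One way to picture the control amount determination described above is a simple proportional rule. The source does not specify how the control amount is computed, so the rule, the function name determine_control_adjustment, and the parameters gain and max_step are all assumptions made purely for illustration.

```python
def determine_control_adjustment(predicted_steam, target_steam, gain=0.5, max_step=10.0):
    """Return a change in air supply (arbitrary units); positive means "increase".

    Hypothetical proportional rule: the larger the predicted shortfall in steam,
    the larger the increase, clamped to +/- max_step.
    """
    shortfall = target_steam - predicted_steam
    step = gain * shortfall
    return max(-max_step, min(max_step, step))


# The model predicts that steam will fall below the target, so air supply is increased.
adjustment = determine_control_adjustment(predicted_steam=45.0, target_steam=50.0)
```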
  • the information processing device 2 includes a control section 20 that centrally controls each section of the information processing device 2, and a storage section 21 that stores various data used by the information processing device 2.
  • The information processing device 2 also includes a communication unit 22 for the information processing device 2 to communicate with other devices, an input unit 23 that receives input of various data to the information processing device 2, and an output unit 24 for the information processing device 2 to output various data.
  • the control unit 20 also includes a data acquisition unit 201, a connection target determination unit 202, a time series data generation unit 203, a data connection unit 204, a teacher data generation unit 205, and a learning unit 206.
  • the data acquisition unit 201 acquires data that is the source of teacher data.
  • The data acquired by the data acquisition unit 201 only needs to include data that becomes an explanatory variable of the inference model to be generated, or data from which such an explanatory variable is derived.
  • The data acquisition unit 201 may communicate with the information processing devices 1A to 1D shown in FIG. 2 via the communication unit 22 to obtain the sensing data.
  • the concatenation target determining unit 202 determines the data to be concatenated by the data concatenation unit 204 from among the data acquired by the data acquisition unit 201.
  • The connection target determination unit 202 is not an essential configuration. However, including the connection target determination unit 202 has the advantage that appropriate training data can be generated even if the data acquired by the data acquisition unit 201 includes data that is not suitable for connection, or combinations that cannot be connected.
  • The time series data generation unit 203 generates time series data used for concatenation by the data concatenation unit 204. More specifically, the time series data generation unit 203 generates, from the measured values measured at the facility and the set values corresponding to those measured values, time series data whose elements are the difference or ratio between the measured value and the set value. The difference or ratio between the measured value and the set value is an explanatory variable in the generated inference model.
  • the time series data generation unit 203 is also not an essential configuration. However, the provision of the time-series data generation unit 203 has the advantage that variations in data caused by different facilities can be reduced. Furthermore, linking data collected at each of multiple facilities has the advantage of reducing bias in numerical fluctuations due to time series.
  • the data linking unit 204 connects a plurality of time series data based on data collected at each of a plurality of facilities to generate one pseudo time series data.
  • The method for generating pseudo time series data will be explained in the section "Example of generation of pseudo time series data and teacher data" below.
  • the teacher data generating unit 205 performs standardization processing or normalization processing on the pseudo time series data generated by the data linking unit 204, and uses it as teacher data.
  • The method for generating training data will also be explained in the section "Example of generation of pseudo time series data and teacher data" below.
  • the learning unit 206 generates an inference model by machine learning using the teacher data generated by the teacher data generating unit 205.
  • the machine learning algorithm is not particularly limited, and for example, the learning unit 206 may generate the inference model using a support vector machine, linear regression, random forest, or neural network.
  • the information processing device 2 includes the data linking section 204 and the teacher data generating section 205.
  • the data linking unit 204 connects a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data.
  • the teacher data generating unit 205 performs standardization processing or normalization processing on the pseudo time series data generated by the data linking unit 204, and uses the data as teacher data.
  • the information processing device 2 includes the data linking section 204, the teacher data generating section 205, and the learning section 206.
  • the data linking unit 204 connects a plurality of time series data based on data collected for each of a plurality of objects to generate one piece of pseudo time series data.
  • the teacher data generating unit 205 performs standardization processing or normalization processing on the pseudo time series data generated by the data linking unit 204, and uses the data as teacher data.
  • the learning unit 206 generates an inference model by machine learning using the teacher data generated by the teacher data generating unit 205.
  • an inference model can be generated using a plurality of time series data based on data collected for each of a plurality of objects. Further, according to the above configuration, it is possible to generate a highly versatile inference model that can be used not only for the facility where data is collected but also for inference at other facilities.
  • FIG. 3 is a diagram showing an example of determining connection targets.
  • steam amount sensor data 111A, temperature sensor data 112A, etc. are collected in facility A. Collection of these data is performed by, for example, the information processing device 1A shown in FIG. 2 (more specifically, the data acquisition unit 101 of the information processing device 1A).
  • Similarly, in facility B, steam amount sensor data 111B, temperature sensor data 112B, etc. are collected, and in facility C, steam amount sensor data 111C, temperature sensor data 112C, etc. are collected. Collection of these data is performed by, for example, the information processing devices 1B and 1C shown in FIG. 2. Then, the data acquisition unit 201 of the information processing device 2 acquires each of the above data collected by the information processing devices 1A to 1C.
  • the concatenation target determining unit 202 determines the data to be concatenated by the data concatenation unit 204 from among the data thus obtained.
  • the connection target determining unit 202 may determine data to be connected according to preset rules, and may exclude unrelated data from being connected.
  • a rule may be set in which data measured by the same type of sensor is to be linked.
  • the connection target determination unit 202 determines the steam amount sensor data 111A to 111C including the measured value measured by the steam amount sensor as the connection object. Furthermore, the connection target determination unit 202 determines the temperature sensor data 112A to 112C including the measured values measured by the temperature sensors as the connection targets. Note that the concatenation target determining unit 202 may assign a common code or identification information to data determined to be concatenation targets.
  • The connection target determining unit 202 may also determine the data to be connected according to a rule that data measured by sensors installed at similar positions in a facility are to be connected.
  • temperature sensor data 112A and 112C in FIG. 3 are both measured by a temperature sensor installed near the superheater of the incinerator, while temperature sensor data 112B is measured by a temperature sensor installed at another location.
  • the connection target determination unit 202 determines the temperature sensor data 112A and 112C measured by temperature sensors installed at similar positions to be the connection targets, and does not consider the temperature sensor data 112B to be the connection target.
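  • The two rules above (same sensor type, and optionally similar installation position) can be sketched as a grouping step. This is only an illustration under assumptions: the record fields sensor_type and position, the position labels, and the function name determine_connection_targets are hypothetical, and the position given for the steam amount sensors is invented for the example.

```python
from collections import defaultdict


def determine_connection_targets(records, require_same_position=False):
    """Group data IDs that may be connected, keyed by the rule's matching attributes."""
    groups = defaultdict(list)
    for rec in records:
        if require_same_position:
            key = (rec["sensor_type"], rec["position"])
        else:
            key = (rec["sensor_type"],)
        groups[key].append(rec["data_id"])
    # Data with no counterpart from another source is excluded from connection.
    return {key: ids for key, ids in groups.items() if len(ids) >= 2}


# Metadata below is hypothetical; only the data IDs follow the FIG. 3 example.
records = [
    {"data_id": "111A", "sensor_type": "steam", "position": "boiler"},
    {"data_id": "111B", "sensor_type": "steam", "position": "boiler"},
    {"data_id": "111C", "sensor_type": "steam", "position": "boiler"},
    {"data_id": "112A", "sensor_type": "temperature", "position": "superheater"},
    {"data_id": "112B", "sensor_type": "temperature", "position": "other"},
    {"data_id": "112C", "sensor_type": "temperature", "position": "superheater"},
]
by_type = determine_connection_targets(records)
by_position = determine_connection_targets(records, require_same_position=True)
```

  • With the position rule enabled, 112B forms a group of one and is therefore dropped, which mirrors the exclusion of temperature sensor data 112B described above.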
  • FIG. 4 is a diagram showing an example of generation of time series data. More specifically, FIG. 4 shows an example in which time-series data used to generate teacher data is generated from each of the steam amount sensor data 111A to 111C shown in FIG. 3.
  • the steam amount sensor data 111A to 111C include measured values (PV: Process Variable) at each time. Further, the steam amount sensor data 111A to 111C shown in FIG. 4 also include setting values corresponding to each measured value.
  • the set value (SV: Set Variable) indicates a target value (in this example, the amount of steam) at the time associated with the set value. That is, in the facilities A to C from which the steam amount sensor data 111A to 111C were obtained, control is performed so that the steam amount approaches the set value.
  • the data acquisition unit 201 may acquire the steam amount sensor data 111A to 111C including measured values and set values.
  • The time series data generation unit 203 may generate, from the measured values measured at facilities A to C and the set values corresponding to those measured values, both of which are included in the steam amount sensor data 111A to 111C, time series data whose elements are the difference or ratio between the measured value and the set value.
  • In the example of FIG. 4, the time series data generation unit 203 generates, from each of the steam amount sensor data 111A to 111C, time series data 113A to 113C whose elements include the PV/SV value at each time, that is, the ratio between the measured value and the set value. If the value of PV/SV is close to 1, it can be said that the state of the facility is normal. Note that the time series data generation unit 203 may instead generate time series data whose elements are the difference between PV and SV, that is, the difference between the measured value and the set value.
  • the time series data 113A to 113C include an element called an abnormality flag.
  • The value of the abnormality flag at each time indicates whether or not the state of facilities A to C is normal. Specifically, in the example of FIG. 4, the value of the abnormality flag is set to 0 when the state is normal and to 1 when the state is not normal. The criteria for determining whether or not the state is normal may be set as appropriate. For example, the time series data generation unit 203 may determine that the state is not normal when the value of PV/SV is less than a predetermined lower limit value or exceeds a predetermined upper limit value, and that the state is normal in other cases.
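  • The generation of PV/SV elements and abnormality flags can be sketched as below. This is a minimal illustration under assumptions: the limits 0.9 and 1.1, the field names, and the function name generate_time_series are hypothetical, since the source only says that the limits are "predetermined".

```python
def generate_time_series(sensor_rows, lower=0.9, upper=1.1):
    """Add the PV/SV ratio and an abnormality flag (0 = normal, 1 = not normal) to each row."""
    series = []
    for row in sensor_rows:
        if row["sv"] == 0:
            raise ValueError("set value is zero; PV/SV is undefined")
        ratio = row["pv"] / row["sv"]
        # Not normal when the ratio falls outside the (assumed) lower and upper limits.
        flag = 1 if (ratio < lower or ratio > upper) else 0
        series.append({
            "time": row["time"],
            "pv": row["pv"],
            "sv": row["sv"],
            "pv_sv": ratio,
            "abnormal": flag,
        })
    return series


# Illustrative steam amounts only, not values from FIG. 4.
rows = [
    {"time": 1, "pv": 50.0, "sv": 50.0},
    {"time": 2, "pv": 40.0, "sv": 50.0},
    {"time": 3, "pv": 60.0, "sv": 50.0},
]
series_a = generate_time_series(rows)
```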
  • the value of the abnormality flag is correct data in machine learning, in other words, the objective variable of the inference model to be generated.
  • the inference model generated using the time series data 113A to 113C is a model for inferring whether the state of the facility is normal or not.
  • the teacher data generation unit 205 may perform the association of objective variables.
  • the objective variable of the inference model is arbitrary and is not limited to whether the state of the facility is normal or not.
  • the time-series data generation unit 203 only needs to generate time-series data that includes any target variable that is desired to be inferred.
  • the time-series data generation unit 203 may generate time-series data including the appropriate control amount for the device.
  • the objective variable may be automatically determined by the time series data generation unit 203, or may be input by the user.
  • The steam amount sensor data 111A to 111C may include a sensor name, a sensor ID, etc., or the data types of the data included in the steam amount sensor data 111A to 111C may differ.
  • In such cases, the time series data generation unit 203 may format the steam amount sensor data 111A to 111C and convert them into data that can be connected.
  • As described above, the time series data generation unit 203 may generate, for each of a plurality of facilities (objects), time series data whose elements are the difference or ratio between the measured value measured at the facility and the set value corresponding to that measured value.
  • Differences and ratios between measured values and set values are indicators that are more versatile than measured values measured at facilities. Therefore, according to the above-described configuration for generating time-series data having such versatile indicators as elements, it is possible to make the pseudo-time-series data natural. That is, according to the above configuration, it is possible to reduce variations in data caused by different facilities. Furthermore, by linking the data collected for each of multiple facilities, it is possible to reduce the bias given to fluctuations in numerical values due to time series. Therefore, according to the above configuration, it is possible to generate an inference model with high inference accuracy.
  • FIG. 5 is a diagram showing an example of generation of pseudo time series data and teacher data. More specifically, FIG. 5 shows pseudo time series data 114 generated from the time series data 113A to 113C shown in FIG. 4, and teacher data 115 generated from the pseudo time series data 114.
  • the data concatenation unit 204 generates pseudo time series data 114 by concatenating the time series data 113A to 113C in this order. However, if the time series data 113A to 113C are simply concatenated as they are, duplication or inconsistency will occur in the "time" values.
  • the data connection unit 204 replaces the time values in the time series data 113A to 113C with continuous values of 1 to 15. These values indicate the order of each data element included in the time series data 113A to 113C, and can be called order information. By providing new order information in place of the original "time” value, it is possible to maintain data continuity and prevent inconsistencies in the "time” values.
  • the data linking unit 204 may link multiple pieces of time-series data by providing order information indicating a series of orders to the plurality of pieces of time-series data.
  • a plurality of time series data can be made into one pseudo time series data by a simple process of adding a series of order information to a plurality of time series data.
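  • The assignment of order information can be sketched as follows. This is an illustrative sketch only: the field names time, pv_sv, and order and the function name concatenate_with_order are assumptions, and the sample values are invented.

```python
def concatenate_with_order(*series_list):
    """Concatenate time series in the given order, replacing "time" with running order information."""
    pseudo = []
    order = 1
    for series in series_list:
        for element in series:
            # Drop the original time value, which would repeat or conflict across sources.
            linked = {key: value for key, value in element.items() if key != "time"}
            linked["order"] = order
            pseudo.append(linked)
            order += 1
    return pseudo


# Two short series whose original time values overlap (1, 2 and 1, 2).
series_a = [{"time": t, "pv_sv": v} for t, v in [(1, 1.00), (2, 0.98)]]
series_b = [{"time": t, "pv_sv": v} for t, v in [(1, 1.05), (2, 1.01)]]
pseudo = concatenate_with_order(series_a, series_b)
```

  • Because the order values are generated while iterating, the result is continuous regardless of how many series are linked or how long each one is.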
  • In general, a highly accurate inference model can be created by standardizing or normalizing the variables.
  • Standardization or normalization requires statistics of the target data set, but those statistics are unique to each data set. Therefore, with conventional technology, time series data that have each undergone standardization processing or normalization processing cannot be combined into a single data set, or, even if they can be combined, learning with such data significantly degrades the performance of the inference model.
  • The data concatenation unit 204 may perform concatenation after performing processing to reduce discontinuity in the numerical values at the joints of a plurality of time series data. This reduces the discontinuity of numerical values at the joints and gives continuity to the connected pseudo time series data, making it possible to generate pseudo time series data in which numerical changes at the joints are natural.
  • Because discontinuity in the numerical values at the joints of the time series data is reduced, preprocessing commonly applied to time series data, such as calculating a moving average, can be applied to the pseudo time series data. This can be expected to further improve the inference accuracy of the inference model.
  • the process for reducing discontinuity is not particularly limited as long as it reduces the discontinuity in numerical values of connected parts in a plurality of time series data.
  • a process for smoothing numerical values of connected parts in a plurality of time series data may be applied.
  • as a process for reducing discontinuity, a process may be applied in which some of the numerical values at each joint of the time series data to be concatenated are deleted so as to reduce the discontinuity.
  • numerical values may be deleted so that the difference between the combined parts becomes less than or equal to a threshold value.
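The deletion-based approach to reducing discontinuity can be sketched as follows. This is only one possible reading of the description: the function name `trim_joint`, the choice to delete alternately from the two sides of the joint, the `min_len` guard, and the sample values are assumptions made for illustration. Both input lists are assumed to be non-empty.

```python
from typing import List, Tuple


def trim_joint(left: List[float], right: List[float],
               threshold: float, min_len: int = 1) -> Tuple[List[float], List[float]]:
    """Delete values at the joint until the gap is at most `threshold`.

    Values are removed alternately from the end of `left` and the start of
    `right`. Trimming stops once |left[-1] - right[0]| <= threshold, or when
    neither side can be shortened without dropping below `min_len` values.
    """
    left, right = list(left), list(right)  # work on copies
    take_from_left = True
    while abs(left[-1] - right[0]) > threshold:
        if take_from_left and len(left) > min_len:
            left.pop()
        elif len(right) > min_len:
            right.pop(0)
        elif len(left) > min_len:
            left.pop()
        else:
            break  # nothing more can be deleted
        take_from_left = not take_from_left
    return left, right


# Hypothetical joint between the tail of one series and the head of the next.
tail = [1.0, 1.1, 1.2, 1.8]
head = [0.6, 1.0, 1.15, 1.2]
trimmed_tail, trimmed_head = trim_joint(tail, head, threshold=0.15)
pseudo_values = trimmed_tail + trimmed_head
```

A smoothing filter over the joint would be an alternative that keeps every sample; deletion is shown here because it matches the threshold condition stated above.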
  • the teacher data generation unit 205 generates the teacher data 115 by performing standardization processing on the pseudo time series data 114. Specifically, the teacher data generation unit 205 generates the teacher data 115 by standardizing each of the three data elements included in the pseudo time series data 114, namely, the measured value, the set value, and PV/SV.
  • the teacher data generation unit 205 generates the teacher data 115 by performing, for each data element, a process of dividing the difference between the value of the data element included in the pseudo time series data 114 and its average value by the standard deviation.
  • the teacher data generation unit 205 may generate the teacher data by performing normalization processing instead of standardization processing. In this case, the teacher data generation unit 205 calculates the maximum value and minimum value of each data element included in the pseudo time series data 114. Then, the teacher data generation unit 205 generates the teacher data by performing, for each data element, a process of dividing the difference between the value of the data element and the calculated minimum value by the data range, that is, the difference between the maximum value and the minimum value.
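The two scaling operations described above can be written out as a short sketch. The function names and sample values are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not say which variant is used. Neither function guards against a constant series, for which the divisor would be zero.

```python
from statistics import mean, pstdev
from typing import List


def standardize(values: List[float]) -> List[float]:
    """Standardization: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return [(v - mu) / sigma for v in values]


def normalize(values: List[float]) -> List[float]:
    """Normalization: (value - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical PV/SV column taken from a linked pseudo time series.
pv_sv = [0.9, 1.0, 1.1, 1.2, 0.8]
z_scores = standardize(pv_sv)
scaled = normalize(pv_sv)
```

In either case the statistics (mean and standard deviation, or minimum and maximum) are computed once per data element over the whole pseudo time series, not per facility.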
  • the teacher data generation unit 205 generates the teacher data 115 by performing standardization processing or normalization processing on the pseudo time series data 114. This makes it possible to generate an inference model that absorbs the characteristics of time-series data for each facility, in other words, an inference model that reflects the characteristics of time-series data for each facility.
  • the teacher data generation unit 205 may notify the conditions for standardization processing or normalization processing to the information processing apparatuses 1A to 1D shown in FIG. 2 through communication via the communication unit 22.
  • the learning unit 206 performs machine learning using the teacher data 115 generated as described above to generate an inference model.
  • the objective variable of the inference model may be the value of the abnormality flag.
  • the explanatory variable of the inference model may be at least one of a measured value, a set value, and a PV/SV value.
  • the learning unit 206 may include values other than these in the explanatory variables.
  • the learning unit 206 may include, as explanatory variables, at least one of a time-series temperature measurement value, a setting value, and a PV/SV value included in the temperature sensor data 112A.
  • at least one of the values obtained by standardizing or normalizing these values may be included in the explanatory variables.
  • the inference model generated using such teacher data 115 can also be applied to inference using data collected at facility D.
  • FIG. 6 is a flowchart illustrating an example of processing when the information processing device 2 generates an inference model. Although details will be described later, the series of processes shown in the flowchart of FIG. 6 includes a method of generating teacher data and a method of generating an inference model.
  • the data acquisition unit 201 acquires data obtained at each facility.
  • the data acquisition unit 201 may acquire time-series sensing data collected at the facilities A to C shown in FIG. 2, respectively, via the information processing devices 1A to 1C.
  • the data collected at each facility may be stored in advance in the storage unit 21 or the like, and in this case, the data acquisition unit 201 may acquire the data from the storage unit 21 or the like.
  • the connection target determination unit 202 determines the data to be connected by the data connection unit 204 from among the data acquired in S11. As described based on FIG. 3, the connection target determination unit 202 may determine data to be connected according to preset rules, and may exclude unrelated data from being connected.
  • the time-series data generation unit 203 generates time-series data to be used for the connection by the data connection unit 204 from the data determined to be the object of connection in S12. Specifically, from the measured value included in the data determined to be linked in S12 and the set value corresponding to the measured value, the time series data generation unit 203 generates time series data having the difference or ratio between the measured value and the set value as an element. This process can also be said to be a process of generating explanatory variables for an inference model using measured values. Further, in S13, the time series data generation unit 203 may also perform a process of associating the generated time series data with a value that is an objective variable of the inference model.
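The generation of a difference or ratio element from measured and set values can be sketched as follows. The function name `make_time_series`, the `mode` parameter, the dictionary keys, and the sample steam-amount values are hypothetical and chosen only for illustration. The ratio branch assumes the set value is never zero.

```python
from typing import Dict, List


def make_time_series(measured: List[float], setpoints: List[float],
                     mode: str = "ratio") -> List[Dict[str, float]]:
    """Build time series elements from measured values (PV) and set values (SV)."""
    if len(measured) != len(setpoints):
        raise ValueError("measured and setpoints must have the same length")
    series = []
    for t, (pv, sv) in enumerate(zip(measured, setpoints), start=1):
        if mode == "ratio":
            derived = pv / sv  # PV/SV; assumes sv is non-zero
        elif mode == "difference":
            derived = pv - sv
        else:
            raise ValueError(f"unknown mode: {mode}")
        series.append({"time": t, "measured": pv, "set": sv, "derived": derived})
    return series


# Hypothetical steam-amount readings and their set value.
steam_measured = [9.0, 10.0, 11.0]
steam_set = [10.0, 10.0, 10.0]
ratio_series = make_time_series(steam_measured, steam_set, mode="ratio")
diff_series = make_time_series(steam_measured, steam_set, mode="difference")
```

Expressing the measurement relative to its set value is what lets series from facilities with different operating points be compared on a common scale.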
  • the data concatenation unit 204 concatenates a plurality of time series data based on data collected at each of a plurality of facilities to generate one pseudo time series data.
  • the teacher data generation unit 205 performs standardization processing or normalization processing on the pseudo time series data generated in S14, and uses it as teacher data. If the time-series data is not associated with the value to be the target variable in S13, the teacher data generation unit 205 associates the value with the pseudo-time-series data in S15.
  • in S16 (learning step), the learning unit 206 generates an inference model by machine learning using the teacher data generated in S15, and the process in FIG. 6 then ends. Note that the learning unit 206 may transmit the generated inference model to the information processing device 1. At this time, the statistics used in the standardization process or normalization process in S15 may also be transmitted.
  • the process of FIG. 6 described above includes a method of generating teacher data. That is, the teacher data generation method executed by the information processing device 2 includes a data linking step (S14) and a teacher data generation step (S15).
  • the data linking step (S14) connects a plurality of time series data based on data collected for each of a plurality of facilities (objects) to generate one pseudo time series data.
  • in the teacher data generation step (S15), the pseudo time series data generated in S14 is subjected to standardization processing or normalization processing to become teacher data. This yields teacher data from which a highly versatile inference model can be generated.
  • the process in FIG. 6 also includes a method for generating an inference model. That is, the inference model generation method executed by the information processing device 2 includes a data linking step (S14), a teacher data generation step (S15), and a learning step (S16).
  • the data linking step (S14) connects a plurality of time series data based on data collected for each of a plurality of facilities (objects) to generate one pseudo time series data.
  • in the teacher data generation step (S15), the pseudo time series data generated in S14 is subjected to standardization processing or normalization processing to become teacher data.
  • in the learning step (S16), an inference model is generated by machine learning using the teacher data generated in S15. Therefore, it is possible to generate a highly versatile inference model.
  • the data acquisition unit 101 of the information processing device 1 acquires the inference model generated in S16 and the statistics used in the standardization process or normalization process in S15. Furthermore, the data acquisition unit 101 acquires inference data collected about a facility that is a target of inference. For example, in the case of the information processing apparatus 1D in FIG. 2, inference data collected at the facility D is acquired.
  • the preprocessing unit 102 performs standardization processing or normalization processing on the inference data using the above-mentioned statistics to generate input data for the inference model.
  • the inference unit 103 inputs the input data generated by the preprocessing unit 102 to the inference model, and obtains an inference result based on the output value output by the inference model.
  • the control amount determination unit 104 determines the control amount for the equipment installed in the target facility based on the above inference result.
  • the information processing device 2 connects the time series data collected for each facility to form one pseudo time series data, and then performs standardization processing or normalization processing. Therefore, standardization processing or normalization processing can be performed on inference data collected at any facility using the same statistics.
  • an information processing device 2 different from the information processing devices 1A to 1D generates the teacher data and the inference model, but any of the information processing devices 1A to 1D may instead generate the teacher data and the inference model. Furthermore, in the information processing system 5 in FIG. 2, the information processing devices 1A to 1D collect data in the facilities A to D and perform inference using an inference model, but these processes may be executed by separate information processing devices.
  • one information processing device 2 generates the teacher data and the inference model, but the generation of the teacher data and the generation of the inference model can also be executed by different information processing devices.
  • the execution entity of each process described in FIG. 6 does not necessarily have to be one device, and the processing can be shared and executed by a plurality of arbitrary information processing devices (computers).
  • the information processing device 2 may execute the processing of S11 to S15 in FIG. 6, and the processing of S16 may be executed by another information processing device.
  • the functions of the information processing devices 1 and 2 can be realized by a program for making a computer function as the devices, that is, a program for making a computer function as each control block of the devices (particularly each unit included in the control units 10 and 20).
  • the inference model generation function in the information processing device 2 can be realized by an inference model generation program.
  • the teacher data generation function in the information processing device 2 can be realized by a teacher data generation program.
  • the device includes a computer having at least one control device (for example, a processor) and at least one storage device (for example, a memory) as hardware for executing the program.
  • control device for example, a processor
  • storage device for example, a memory
  • the above program may be recorded on one or more non-transitory computer-readable recording media.
  • This recording medium may or may not be included in the above device. In the latter case, the program may be supplied to the device via any transmission medium, wired or wireless.
  • each of the control blocks described above can also be realized by a logic circuit.
  • for example, an integrated circuit in which a logic circuit functioning as each of the control blocks described above is formed is also included in the scope of the present invention.
  • the information processing device includes: a data linking unit that connects a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data;
  • a teacher data generation unit that performs standardization processing or normalization processing on the pseudo time series data to produce teacher data; and a learning unit that generates an inference model by machine learning using the teacher data.
  • the data linking unit may be configured to connect the plurality of time series data by providing order information indicating a series of orders to the plurality of time series data.
  • the data concatenation unit may be configured to perform the concatenation after performing a process of reducing numerical discontinuity of the concatenated portions in the plurality of time series data.
  • the apparatus further includes a time-series data generation unit that generates, for each of the plurality of objects, the time-series data having as an element the difference or ratio between a measured value measured for the object and the set value corresponding to that measured value.
  • An information processing device includes: a data linking unit that connects a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data; and a teacher data generation unit that performs standardization processing or normalization processing on the pseudo time series data and generates teacher data.
  • a generation method is an inference model generation method executed by one or more information processing devices, and includes: a data concatenation step of connecting a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data;
  • a teacher data generation step of performing standardization processing or normalization processing on the pseudo time series data to obtain teacher data; and a learning step of generating an inference model by machine learning using the teacher data.
  • a generation method is a teacher data generation method executed by one or more information processing devices, and includes: a data concatenation step of connecting a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data;
  • and a teacher data generation step of subjecting the pseudo time series data to standardization processing or normalization processing to obtain teacher data.
  • the inference model generation program according to aspect 8 of the present invention is an inference model generation program for causing a computer to function as the information processing apparatus according to aspect 1, and causes the computer to function as the data linking unit, the teacher data generation unit, and the learning unit.
  • a teacher data generation program according to aspect 9 of the present invention is a teacher data generation program for causing a computer to function as the information processing apparatus according to aspect 5, and causes the computer to function as the data linking unit and the teacher data generation unit.
  • Time series data generation section; 204 Data connection section; 205 Teacher data generation section; 206 Learning section; 111A, 111B, 111C Steam amount sensor data (measured value, set value); 112A, 112B, 112C Temperature sensor data (measured value, set value); 113A, 113B Time series data; 114 Pseudo time series data; 115 Training data

Abstract

The present invention enables generation of a highly versatile inference model. An information processing device (2) comprises: a data combination unit (204) that combines a plurality of pieces of time series data, which are based on pieces of data collected from a plurality of facilities respectively, to generate pseudo time series data; a training data generation unit (205) that applies standardization processing to the pseudo time series data to obtain training data; and a training unit (206) that generates an inference model by machine learning using the training data.

Description

Information processing device, inference model generation method, teacher data generation method, inference model generation program, and teacher data generation program
The present invention relates to an information processing device and the like that generate an inference model.
Techniques for generating inference models by machine learning are conventionally known. For example, Patent Document 1 listed below discloses a neural network model that infers the type of event that has occurred in a plant from a plurality of plant data acquired in that plant. Note that the "plurality of plant data" here is time-series data collected at a single plant.
Japanese Patent No. 3002524
The conventional technique described above leaves room for improvement in terms of the versatility of the inference model. Specifically, in the technique of Patent Document 1, machine learning is performed using plant data collected at the target plant. An inference model generated through such learning can infer, with high accuracy, the type of event that occurred in that plant from plant data collected at that plant. However, the inference model cannot infer the type of event that occurred at another plant from plant data collected at that other plant, or, even if it can, its accuracy will be low.
An object of one aspect of the present invention is to provide an information processing device and the like that can generate a highly versatile inference model.
In order to solve the above problem, an information processing device according to one aspect of the present invention includes: a data linking unit that connects a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data; a teacher data generation unit that performs standardization processing or normalization processing on the pseudo time series data to produce teacher data; and a learning unit that generates an inference model by machine learning using the teacher data.
In order to solve the above problem, another information processing device according to one aspect of the present invention includes: a data linking unit that connects a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data; and a teacher data generation unit that performs standardization processing or normalization processing on the pseudo time series data to produce teacher data.
In order to solve the above problem, an inference model generation method according to one aspect of the present invention is an inference model generation method executed by one or more information processing devices, and includes: a data concatenation step of connecting a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data; a teacher data generation step of performing standardization processing or normalization processing on the pseudo time series data to produce teacher data; and a learning step of generating an inference model by machine learning using the teacher data.
In order to solve the above problem, a teacher data generation method according to one aspect of the present invention is a teacher data generation method executed by one or more information processing devices, and includes: a data concatenation step of connecting a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data; and a teacher data generation step of performing standardization processing or normalization processing on the pseudo time series data to produce teacher data.
According to one aspect of the present invention, it is possible to generate a highly versatile inference model.
FIG. 1 is a block diagram showing an example of the configuration of main parts of an information processing device according to an embodiment of the present invention.
FIG. 2 is a diagram showing the configuration of an information processing system including the information processing device.
FIG. 3 is a diagram illustrating an example of determining connection targets.
FIG. 4 is a diagram showing an example of generation of time series data.
FIG. 5 is a diagram showing an example of generation of pseudo time series data and teacher data.
FIG. 6 is a flowchart illustrating an example of processing when the information processing device generates an inference model.
(System configuration)
FIG. 2 is a diagram showing the configuration of the information processing system 5 according to this embodiment. As shown in the figure, the information processing system 5 includes information processing devices 1A to 1D and an information processing device 2. Although details will be described below, the information processing system 5 can generate an inference model using a plurality of time series data from different sources. The inference model generated by the information processing system 5 is more versatile than the inference model generated using time series data from a single source as training data.
The information processing devices 1A to 1D are devices that collect time-series data that is the source of teacher data for generating the above-mentioned inference model, and are located in facilities A to D, respectively. In the following, when there is no need to distinguish between the information processing devices 1A to 1D, they are simply referred to as "information processing device 1."
Facilities A to D need only be equipped with at least one piece of equipment. For example, facilities A to D may be plants equipped with multiple pieces of equipment. Here, a "plant" is an industrially used facility equipped with a plurality of devices. The "plant" uses these devices to perform predetermined processing such as production of products or processing of objects.
In the following, an example will be described in which facilities A to D are waste treatment facilities that incinerate waste (for example, combustible garbage) and generate electricity using the waste heat. In this case, the information processing device 1A collects time-series data regarding waste incineration and power generation at facility A. Similarly, the information processing devices 1B to 1D collect time-series data regarding waste incineration and power generation at facilities B to D.
The time-series data need only be data that can serve as the source of teacher data for generating the inference model, and may be chosen according to the content of the inference. For example, when generating an inference model that predicts the combustion state of an incinerator, various time-series data related to the combustion state of the incinerator may be collected. To give a specific example, time-series sensing data related to automatic combustion control (ACC), such as time-series measurements of furnace temperature and time-series measurements of the amount of steam generated in a boiler, may be collected as the time-series data.
The information processing device 2 connects the time series data collected by the information processing devices 1A to 1D to generate one time series data. The information processing device 2 then performs standardization processing or normalization processing on the connected time series data to produce teacher data, and generates an inference model by machine learning using the teacher data.
An inference model generated in this way is more versatile than one trained on time series data from a single source. For example, an inference model generated using teacher data derived from the time-series data of facilities A to D can be used for inference at any of facilities A to D, and can also be used for inference at facilities other than A to D as long as they are of the same type. In this way, generation of the inference model by the information processing device 2 does not necessarily require data collected at the facility that is the target of inference. Therefore, for example, when a new facility is constructed, the inference model can be used immediately after the facility starts operating. Further, it is not necessary for all of the information processing devices 1A to 1D to collect the time-series data that is the source of teacher data. For example, an inference model can be generated by collecting such time-series data from facilities A to C but not from facility D. This inference model can also be used for inference at facility D.
In general, generating an estimation model with sufficient estimation accuracy requires a sufficient amount of teacher data, so the data collection period at each facility tends to be long. In this regard, because the information processing system 5 generates teacher data from time-series data collected at a plurality of facilities, the required amount of teacher data can be gathered in a short period and an inference model can be generated quickly.
Note that the time-series data to be linked is not limited to data collected at a plurality of different facilities. For example, the information processing device 2 can also connect time-series data collected for a plurality of different installations or devices in one facility. For example, if the above-mentioned facility A is equipped with two incinerators, the information processing device 2 can connect the time-series data collected at each incinerator and generate an inference model that can be used for inference at either of the two incinerators. Therefore, "facility" in the following description can be read as any "object".
The method of collecting time-series data is also arbitrary. For example, values measured by a sensor installed on the object itself may be collected as time-series data. Values measured by sensors installed around the object, or data measured over a wider area, such as temperature and humidity, may also be collected as time-series data. In addition, for example, set values for configuring the operation of the object, or command values issued when causing the object to execute a predetermined operation, may be collected as time-series data.
(Device configuration)
The configurations of the information processing devices 1 and 2 will be described based on FIG. 1. FIG. 1 is a block diagram showing an example of the configuration of main parts of the information processing devices 1 and 2. As illustrated, the information processing device 1 includes a control unit 10 that centrally controls each part of the information processing device 1, and a storage unit 11 that stores various data used by the information processing device 1. The information processing device 1 also includes a communication unit 12 for communicating with other devices, an input unit 13 that receives input of various data to the information processing device 1, and an output unit 14 through which the information processing device 1 outputs various data. The control unit 10 includes a data acquisition unit 101, a preprocessing unit 102, an inference unit 103, and a control amount determination unit 104.
The data acquisition unit 101 acquires the time-series data from which training data for generating an inference model is derived. The time-series data may be acquired from sensors or the like placed in a facility via the communication unit 12 or the input unit 13, or may be entered by a user of the information processing device 1 via the input unit 13. The data acquisition unit 101 then transmits the acquired data to the information processing device 2 via the communication unit 12.
The data acquisition unit 101 also acquires the inference model generated using the data transmitted to the information processing device 2. The inference model may be acquired from the information processing device 2 by communication via the communication unit 12, or may be entered by a user of the information processing device 1 via the input unit 13.
Furthermore, when inference is performed using the acquired inference model, the data acquisition unit 101 acquires the data to be used for inference, referred to below as inference data. The inference data may be input via the communication unit 12 or the input unit 13 from sensors or the like placed in the facility that is the target of inference, or may be entered by a user of the information processing device 1 via the input unit 13.
The preprocessing unit 102 preprocesses the inference data acquired by the data acquisition unit 101 and generates the input data to be fed to the inference model. As described in detail later, this preprocessing standardizes or normalizes the inference data under the same conditions that the information processing device 2 applied when generating the training data for the inference model.
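The idea of reusing the training-time conditions at inference time can be illustrated with a minimal Python sketch. The container `ScalingParams`, the function `preprocess_for_inference`, and the input values are hypothetical names and numbers chosen for illustration; the mean and standard deviation are the figures quoted for the measured values in the standardization example later in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingParams:
    """Statistics fixed when the training data was generated (hypothetical container)."""
    mean: float
    std: float


def preprocess_for_inference(values, params):
    """Standardize inference data with the stored training-time statistics.

    The statistics are NOT recomputed from the inference data; reusing the
    training-time values is what "the same conditions" is taken to mean here.
    """
    return [(v - params.mean) / params.std for v in values]


# Mean and standard deviation of the measured values quoted later in the text.
params = ScalingParams(mean=9.387, std=0.519)

# Illustrative inference inputs (assumed values, not from the source).
model_input = preprocess_for_inference([9.8, 9.2], params)
```

Keeping the statistics in a small immutable object makes it harder to recompute them by accident from the inference data, which would silently change the scale the model was trained on.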
The inference unit 103 performs inference using the inference model generated by the information processing device 2. More specifically, the inference unit 103 inputs the input data generated by the preprocessing unit 102 into the inference model and either uses the resulting output value directly as the inference result or derives the inference result from that output value.
The inference target of the inference model is not particularly limited. For example, the model may predict the combustion state in the incinerators of facilities A to D shown in FIG. 2. To generate power efficiently at facilities A to D, the amount of steam that drives the power-generation turbines must be kept stable, but the combustion state in an incinerator changes with fluctuations in the quality and quantity of the waste being incinerated, and the amount of steam changes accordingly. By using an inference model that predicts the combustion state, the information processing devices 1A to 1D predict the combustion state in the incinerators of facilities A to D. The information processing devices 1A to 1D can then control the equipment in facilities A to D appropriately according to the prediction results, so that waste incineration and power generation are carried out stably.
The control amount determination unit 104 determines, based on the inference result of the inference unit 103, a control amount for equipment installed in the target facility. How the control amount is determined depends on which inference model was used for the inference. Suppose, for example, that inference with a model that predicts the amount of steam generated by the boiler of an incinerator indicates that the amount of generated steam will decrease. In this case, the control amount determination unit 104 may determine a control amount for equipment that affects the amount of steam generated by the boiler (for example, the amount of air supplied into the incinerator or the operating speed of the grate). Naturally, the control amount is then determined so that the amount of generated steam increases, for example by increasing the amount of air supplied into the incinerator or by raising the operating speed of the grate.
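One way to picture this decision is the following Python sketch. The description does not specify a control law, so the proportional rule, the function name `decide_air_supply`, and the parameters `gain` and `max_step` are all assumptions made purely for illustration.

```python
def decide_air_supply(current_air, predicted_steam, target_steam,
                      gain=0.5, max_step=0.1):
    """Return a new air-supply command from a predicted steam amount.

    Hypothetical proportional rule: if the predicted steam amount falls
    short of the target, the air supply is raised; if it overshoots, the
    air supply is lowered. The relative change is clamped to +/- max_step.
    """
    shortfall = (target_steam - predicted_steam) / target_steam
    step = max(-max_step, min(max_step, gain * shortfall))
    return current_air * (1.0 + step)


# Predicted steam (9.0) is below the target (10.0), so the command rises.
new_air = decide_air_supply(current_air=100.0, predicted_steam=9.0,
                            target_steam=10.0)
```

Clamping the step is a conservative design choice for this sketch: a single poor prediction then cannot swing the command by more than a bounded fraction.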
The information processing device 2, for its part, includes a control unit 20 that centrally controls each unit of the information processing device 2, and a storage unit 21 that stores various data used by the information processing device 2. The information processing device 2 also includes a communication unit 22 for communicating with other devices, an input unit 23 that accepts input of various data to the information processing device 2, and an output unit 24 through which the information processing device 2 outputs various data. The control unit 20 includes a data acquisition unit 201, a linking target determination unit 202, a time-series data generation unit 203, a data linking unit 204, a training data generation unit 205, and a learning unit 206.
The data acquisition unit 201 acquires the data from which the training data is derived. It suffices that the acquired data include data that serves as an explanatory variable of the estimation model to be generated, or data from which such an explanatory variable is derived. For example, when generating an estimation model whose explanatory variables are time-series sensing data collected at each of facilities A to D, the data acquisition unit 201 may communicate with the information processing devices 1A to 1D shown in FIG. 2 via the communication unit 22 to acquire the sensing data.
The linking target determination unit 202 determines which of the data acquired by the data acquisition unit 201 are to be linked by the data linking unit 204. The linking target determination unit 202 is not an essential component. Providing it, however, has the advantage that valid training data can be generated even when the data acquired by the data acquisition unit 201 include items unsuitable for linking or combinations that cannot be linked.
The time-series data generation unit 203 generates the time-series data used for linking by the data linking unit 204. More specifically, from measured values obtained at a facility and the set values corresponding to those measured values, the time-series data generation unit 203 generates time-series data whose elements are the difference or the ratio between each measured value and its set value. The difference or ratio between the measured value and the set value is an explanatory variable of the inference model to be generated.
The time-series data generation unit 203 is not an essential component either. Providing it, however, has the advantage of reducing the variation in data that arises from differences between facilities. It has the further advantage of reducing the bias that linking data collected at multiple facilities introduces into the time-dependent variation of the values.
The data linking unit 204 links multiple time-series data, each based on data collected at one of multiple facilities, to generate a single pseudo time-series data set. The method of generating the pseudo time-series data is described below in the section "Example of generation of pseudo time series data and training data".
The training data generation unit 205 applies standardization or normalization to the pseudo time-series data generated by the data linking unit 204 to produce training data. The method of generating the training data is also described below in the section "Example of generation of pseudo time series data and training data".
The learning unit 206 generates an inference model by machine learning using the training data generated by the training data generation unit 205. The machine learning algorithm is not particularly limited; for example, the learning unit 206 may generate the inference model using a support vector machine, linear regression, a random forest, or a neural network.
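As a rough illustration of the learning step, the sketch below fits a one-variable linear regression, one of the algorithms listed above, by ordinary least squares in plain Python. The function names `fit_linear` and `predict` and the toy data are assumptions for illustration; the description does not prescribe any particular implementation or library.

```python
def fit_linear(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("explanatory variable has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted model to one explanatory value."""
    slope, intercept = model
    return slope * x + intercept


# Toy training data (assumed): a standardized explanatory variable and a target.
model = fit_linear([-1.0, 0.0, 1.0], [1.0, 3.0, 5.0])
```

In practice an established machine learning library would normally be used; the hand-written fit only shows where the training data enters and what the resulting model object represents.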
As described above, the information processing device 2 includes the data linking unit 204 and the training data generation unit 205. The data linking unit 204 links multiple time-series data, each based on data collected for one of multiple targets, to generate a single pseudo time-series data set. The training data generation unit 205 applies standardization or normalization to the pseudo time-series data generated by the data linking unit 204 to produce training data.
With the above configuration, training data can be generated from which a highly versatile inference model can be produced. The above configuration therefore makes it possible to provide a highly versatile inference model.
Also as described above, the information processing device 2 includes the data linking unit 204, the training data generation unit 205, and the learning unit 206. The data linking unit 204 links multiple time-series data, each based on data collected for one of multiple targets, to generate a single pseudo time-series data set. The training data generation unit 205 applies standardization or normalization to the pseudo time-series data generated by the data linking unit 204 to produce training data. The learning unit 206 generates an inference model by machine learning using the training data generated by the training data generation unit 205.
With the above configuration, an inference model can be generated using multiple time-series data based on data collected for each of multiple targets. Moreover, a highly versatile inference model can be generated that is usable for inference not only at the facilities from which the data were collected but also at other facilities.
(Example of determining linking targets)
FIG. 3 is a diagram showing an example of determining linking targets. In the example of FIG. 3, steam amount sensor data 111A, temperature sensor data 112A, and the like are collected at facility A. These data are collected by, for example, the information processing device 1A shown in FIG. 2 (more specifically, by the data acquisition unit 101 of the information processing device 1A). Similarly, steam amount sensor data 111B, temperature sensor data 112B, and the like are collected at facility B, and steam amount sensor data 111C, temperature sensor data 112C, and the like are collected at facility C. These data are collected by, for example, the information processing devices 1B and 1C shown in FIG. 2. The data acquisition unit 201 of the information processing device 2 then acquires each of the above data collected by the information processing devices 1A to 1C.
The linking target determination unit 202 determines which of the data acquired in this way are to be linked by the data linking unit 204. For example, the linking target determination unit 202 may determine the data to be linked according to preset rules, so that unrelated data are not linked.
For example, a rule may be set that data measured by the same type of sensor are to be linked. In this case, as shown in FIG. 3, the linking target determination unit 202 determines that the steam amount sensor data 111A to 111C, which contain values measured by steam amount sensors, are to be linked. The linking target determination unit 202 likewise determines that the temperature sensor data 112A to 112C, which contain values measured by temperature sensors, are to be linked. The linking target determination unit 202 may assign a common code or identification information to the data it has determined to be linking targets.
In addition to or instead of the above rule, the linking target determination unit 202 may determine the data to be linked according to a rule that data measured by sensors installed at similar positions in their facilities are linked. Suppose, for example, that the temperature sensor data 112A and 112C in FIG. 3 were both measured by temperature sensors installed near the superheater of the incinerator, whereas the temperature sensor data 112B was measured by a temperature sensor installed at a different position. In this case, the linking target determination unit 202 determines that the temperature sensor data 112A and 112C, measured by temperature sensors installed at similar positions, are to be linked, and excludes the temperature sensor data 112B from linking.
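The two rules described here, same sensor type and similar installation position, can be sketched as a grouping step in Python. The record layout (`id`, `sensor_type`, `position`), the position labels, and the function name `group_linking_targets` are hypothetical; in particular, the position assigned to the steam amount sensors is an assumed placeholder.

```python
from collections import defaultdict


def group_linking_targets(datasets, keys=("sensor_type", "position")):
    """Group data set IDs that share the same values for the rule keys.

    Groups with fewer than two members are dropped, since a single data
    set has nothing to be linked with.
    """
    groups = defaultdict(list)
    for dataset in datasets:
        groups[tuple(dataset[key] for key in keys)].append(dataset["id"])
    return {key: ids for key, ids in groups.items() if len(ids) >= 2}


# Metadata mirroring the FIG. 3 example; the position labels are assumptions.
datasets = [
    {"id": "111A", "sensor_type": "steam", "position": "boiler"},
    {"id": "111B", "sensor_type": "steam", "position": "boiler"},
    {"id": "111C", "sensor_type": "steam", "position": "boiler"},
    {"id": "112A", "sensor_type": "temperature", "position": "superheater"},
    {"id": "112B", "sensor_type": "temperature", "position": "other"},
    {"id": "112C", "sensor_type": "temperature", "position": "superheater"},
]

targets = group_linking_targets(datasets)
```

Passing `keys=("sensor_type",)` applies only the sensor-type rule, in which case 112B would be grouped together with 112A and 112C.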
(Example of time series data generation)
FIG. 4 is a diagram showing an example of generating time-series data. More specifically, FIG. 4 shows an example in which the time-series data used to generate the training data are generated from each of the steam amount sensor data 111A to 111C shown in FIG. 3.
As shown in FIG. 4, the steam amount sensor data 111A to 111C contain a measured value (PV: Process Variable) for each time. The steam amount sensor data 111A to 111C shown in FIG. 4 also contain a set value corresponding to each measured value. The set value (SV: Set Variable) indicates the value targeted (in this example, the amount of steam) at the time associated with that set value. In other words, at facilities A to C, from which the steam amount sensor data 111A to 111C were obtained, control is performed so that the amount of steam approaches the set value.
In this way, the data acquisition unit 201 may acquire steam amount sensor data 111A to 111C that contain measured values and set values. In this case, from the measured values obtained at facilities A to C and the corresponding set values contained in the steam amount sensor data 111A to 111C, the time-series data generation unit 203 may generate time-series data whose elements are the difference or the ratio between each measured value and its set value.
In the example of FIG. 4, the time-series data generation unit 203 generates, from each of the steam amount sensor data 111A to 111C, time-series data 113A to 113C whose elements are the PV/SV value at each time, that is, the ratio of the measured value to the set value. When the PV/SV value is close to 1, the state of the facility can be regarded as normal. Instead of PV/SV, the time-series data generation unit 203 may generate time-series data whose elements are the difference between PV and SV, that is, the difference between the measured value and the set value.
The time-series data 113A to 113C also contain an element called an abnormality flag. The value of the abnormality flag at each time indicates whether the state of facilities A to C is normal. Specifically, in the example of FIG. 4, the abnormality flag is set to 0 when the state is normal and to 1 when it is not. The criterion for judging normality may be defined as appropriate. For example, the time-series data generation unit 203 may judge the state to be not normal when the PV/SV value is below a predetermined lower limit or above a predetermined upper limit, and normal otherwise.
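The generation of PV/SV time-series data with an abnormality flag can be sketched in Python as follows. The limits 0.95 and 1.05, the field names, and the function name `build_series` are assumptions for illustration; the description leaves the judgment criterion open.

```python
def build_series(records, lower=0.95, upper=1.05):
    """Turn (time, PV, SV) records into rows with PV/SV and an abnormality flag.

    The flag is 0 when lower <= PV/SV <= upper and 1 otherwise. The default
    limits are hypothetical; the source only says they are predetermined.
    """
    rows = []
    for time, pv, sv in records:
        ratio = pv / sv
        flag = 0 if lower <= ratio <= upper else 1
        rows.append({"time": time, "pv": pv, "sv": sv,
                     "pv_sv": ratio, "abnormal": flag})
    return rows


# Illustrative records (assumed values): the second one deviates strongly.
series = build_series([(1, 9.8, 10.0), (2, 8.0, 10.0)])
```

To use the difference instead of the ratio, only the line that computes `ratio` and the limits would need to change.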
The value of the abnormality flag is the ground-truth data for machine learning, in other words the objective variable of the inference model to be generated. The inference model generated using the time-series data 113A to 113C is therefore a model that infers whether the state of a facility is normal. The objective variable may instead be associated with the data by the training data generation unit 205.
Of course, the objective variable of the inference model is arbitrary and is not limited to whether the state of the facility is normal. The time-series data generation unit 203 need only generate time-series data that contain whatever objective variable is to be inferred. For example, when generating an inference model that estimates an appropriate control amount for a given piece of equipment in a facility, the time-series data generation unit 203 may generate time-series data that contain the appropriate control amount for that equipment. The objective variable may be determined automatically by the time-series data generation unit 203 or entered by a user.
The steam amount sensor data 111A to 111C may also contain sensor names, sensor IDs, and the like, or the data types of the data they contain may differ. In such cases, the time-series data generation unit 203 may reformat the steam amount sensor data 111A to 111C and convert them into data that can be linked.
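A minimal Python sketch of such reformatting is shown below. The field names (`sensor_id`, `sensor_name`, `time`, `pv`, `sv`) and the function name `to_linkable` are hypothetical; the sketch simply drops metadata fields and casts the remaining fields to common numeric types.

```python
def to_linkable(record):
    """Keep only the fields needed for linking and unify their data types.

    Metadata such as sensor names or IDs is dropped, and values that may
    arrive as strings or integers are cast to a common numeric type.
    """
    return {
        "time": int(record["time"]),
        "pv": float(record["pv"]),
        "sv": float(record["sv"]),
    }


# A raw record with metadata and mixed types (assumed layout and values).
raw = {"sensor_id": "S-01", "sensor_name": "steam flow",
       "time": "1", "pv": "9.8", "sv": 10}
clean = to_linkable(raw)
```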
As described above, from measured values obtained at a facility and the set values corresponding to those measured values, the time-series data generation unit 203 may generate, for each of multiple facilities (targets), time-series data whose elements are the difference or the ratio between each measured value and its set value.
The difference or ratio between a measured value and its set value is a more general-purpose indicator than the raw value measured at a facility. With the above configuration, which generates time-series data whose elements are such general-purpose indicators, the pseudo time-series data can therefore be made natural. That is, the above configuration reduces the variation in data that arises from differences between facilities. It also reduces the bias that linking data collected for each of multiple facilities introduces into the time-dependent variation of the values. The above configuration thus makes it possible to generate an inference model with high inference accuracy.
(Example of generation of pseudo time series data and training data)
FIG. 5 is a diagram showing an example of generating pseudo time-series data and training data. More specifically, FIG. 5 shows pseudo time-series data 114 generated from the time-series data 113A to 113C shown in FIG. 4, and training data 115 generated from the pseudo time-series data 114.
First, the generation of the pseudo time-series data 114 by the data linking unit 204 is described. The data linking unit 204 links the time-series data 113A to 113C in this order to generate the pseudo time-series data 114. If the time-series data 113A to 113C were simply linked as they are, however, the "time" values would overlap or become inconsistent.
When linking, the data linking unit 204 therefore replaces the time values in the time-series data 113A to 113C with the consecutive values 1 to 15. These values indicate the order of the data elements contained in the time-series data 113A to 113C and can be called order information. Assigning new order information in place of the original "time" values preserves the continuity of the data while preventing inconsistencies in the "time" values.
In this way, the data linking unit 204 may link multiple time-series data by assigning to them order information that indicates a single continuous order. With this configuration, multiple time-series data can be turned into a single pseudo time-series data set by the simple process of assigning one run of order information to them.
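The linking step described above can be sketched in Python as follows. The row layout and the function name `link_series` are assumptions for illustration; the `pv_sv` values in the example are placeholders, not the values shown in FIG. 5.

```python
def link_series(*series_list):
    """Link several time series into one pseudo time series.

    The original "time" of each row is discarded and replaced with a
    single running order number starting at 1.
    """
    linked = []
    for series in series_list:
        for row in series:
            new_row = {k: v for k, v in row.items() if k != "time"}
            new_row["order"] = len(linked) + 1
            linked.append(new_row)
    return linked


# Three series of five rows each, with overlapping "time" values 1..5.
series_a = [{"time": t, "pv_sv": 1.0} for t in range(1, 6)]
series_b = [{"time": t, "pv_sv": 1.0} for t in range(1, 6)]
series_c = [{"time": t, "pv_sv": 1.0} for t in range(1, 6)]

pseudo = link_series(series_a, series_b, series_c)
```

With three five-row inputs the order numbers run from 1 to 15, matching the replacement of the time values described above.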
In general, when an inference model is created from multiple variables whose data distributions differ, standardizing or normalizing the variables allows a highly accurate inference model to be created. Standardization or normalization requires statistics of the data set being processed, and those statistics are specific to that data set. With conventional techniques, therefore, it is either impossible to link time-series data that have already been standardized or normalized into a single data set, or, even if they can be linked, training on such data significantly degrades the performance of the inference model.
In contrast, with the above configuration, the multiple time-series data are linked into a single pseudo time-series data set beforehand, so that standardization or normalization and the linking of multiple time-series data can both be achieved.
The data linking unit 204 may also perform the linking after applying processing that reduces the discontinuity of the values at the junctions between the multiple time-series data. This reduces the discontinuity at the junctions, gives the linked pseudo time-series data continuity, and makes it possible to generate pseudo time-series data in which the values change naturally across the junctions.
Furthermore, because the discontinuity of the values at the junctions is reduced, preprocessing commonly applied to time-series data, such as computing a moving average, can be applied to the pseudo time-series data. This can be expected to further improve the inference accuracy of the inference model.
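As an example of such common preprocessing, a trailing moving average over the pseudo time-series values could look like the following Python sketch. The function name `moving_average`, the window size, and the sample values are assumptions for illustration.

```python
def moving_average(values, window):
    """Return the trailing moving average of `values` over `window` points.

    The result has len(values) - window + 1 elements, one for each
    position at which a full window is available.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Placeholder values; windows that span a junction mix two facilities' data,
# which is why reducing the discontinuity there matters.
smoothed = moving_average([1.0, 2.0, 3.0, 4.0], window=2)
```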
The processing that reduces the discontinuity is not particularly limited, as long as it reduces the discontinuity of the values at the junctions between the multiple time-series data. For example, processing that smooths the values at the junctions may be applied. Alternatively, processing may be applied that deletes some of the values at the junction of each time-series data to be linked so that the discontinuity is reduced. In this case, the values may be deleted so that, for example, the difference across the junction becomes less than or equal to a threshold.
For example, the data linking unit 204 may replace the PV/SV value at time = 6 in the pseudo time-series data 114 with the average of that value and the PV/SV value at time = 5. In this case, the PV/SV value at time = 6 becomes (0.989 + 1.035) / 2 = 1.012, which smooths the change in PV/SV at the junction between the time-series data 113A and 113B.
As another example, the data linking unit 204 may delete the PV/SV value at time = 10 in the pseudo time-series data 114. This smooths the change in PV/SV at the junction between the time-series data 113B and 113C.
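Both junction treatments, averaging and deletion, can be sketched in Python. The function names `smooth_junction` and `trim_junction`, the threshold, and the values passed to `trim_junction` are assumptions for illustration; the two values 0.989 and 1.035 are the ones used in the averaging example above, and their order in the list is assumed.

```python
def smooth_junction(values, junction_index):
    """Replace the first value after a junction with the average of that
    value and the last value before the junction.

    `junction_index` is the 0-based index of the first element that came
    from the later time series.
    """
    out = list(values)
    out[junction_index] = (values[junction_index - 1] + values[junction_index]) / 2
    return out


def trim_junction(earlier, later, threshold):
    """Delete leading values of `later` until the gap across the junction
    is at most `threshold` (at least one value is always kept)."""
    trimmed = list(later)
    while len(trimmed) > 1 and abs(earlier[-1] - trimmed[0]) > threshold:
        trimmed.pop(0)
    return trimmed


# Averaging: the two junction values from the example in the text.
smoothed = smooth_junction([0.989, 1.035], junction_index=1)

# Deletion: placeholder values with a hypothetical threshold of 0.05.
kept = trim_junction([1.0, 0.99], [1.2, 1.01, 1.0], threshold=0.05)
```

Averaging keeps the number of elements unchanged, whereas deletion shortens the series, so order information would be assigned after trimming.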
 続いて、教師データ生成部205による教師データ115の生成について説明する。図5の例では、教師データ生成部205は、擬似的時系列データ114に対して標準化処理を施すことにより教師データ115を生成している。具体的には、教師データ生成部205は、擬似的時系列データ114に含まれる、測定値、設定値、およびPV/SVという3つのデータ要素のそれぞれについて標準化処理を行ったものを教師データ115としている。 Next, generation of the teacher data 115 by the teacher data generation unit 205 will be explained. In the example of FIG. 5, the teacher data generation unit 205 generates the teacher data 115 by performing standardization processing on the pseudo time series data 114. Specifically, the teacher data generation unit 205 generates the teacher data 115 by standardizing each of the three data elements included in the pseudo time series data 114, namely, the measured value, the setting value, and PV/SV. It is said that
More specifically, the teacher data generation unit 205 generates the teacher data 115 by dividing, for each data element included in the pseudo time series data 114, the difference between the value of the data element and its mean by its standard deviation. For example, in the pseudo time series data 114, the mean of the measured values is 9.387 and the standard deviation is 0.519. Accordingly, the teacher data generation unit 205 standardizes 9.8, the measured value at time = 1 in the pseudo time series data 114, to (9.8 - 9.387)/0.519 = 0.795.
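The standardization step can be sketched as below. The function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used. With the rounded statistics quoted above the quotient evaluates to about 0.796; the 0.795 given in the text presumably reflects unrounded statistics.

```python
import statistics


def z_score(value, mean, std):
    """Standardize one value: (value - mean) / standard deviation."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std


def standardize(values):
    """Standardize a whole data element (for example, all measured values)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std: an assumption
    return [z_score(v, mean, std) for v in values]


# Worked example with the rounded statistics quoted in the text.
z = z_score(9.8, 9.387, 0.519)
print(round(z, 3))  # 0.796
```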
The teacher data generation unit 205 may also generate the teacher data by performing normalization processing instead of standardization processing. In this case, the teacher data generation unit 205 calculates the maximum value and the minimum value of each data element included in the pseudo time series data 114. The teacher data generation unit 205 then generates the teacher data by dividing, for each data element, the difference between the value of the data element and the calculated minimum value by the data range, that is, the difference between the maximum value and the minimum value.
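The normalization alternative corresponds to ordinary min-max scaling. The following is a minimal sketch with a hypothetical function name and illustrative values; it also returns the minimum and maximum so that they can be reused later.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]: (value - min) / (max - min).

    Returns the scaled values together with the minimum and maximum used."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("data range is zero; normalization is undefined")
    data_range = highest - lowest
    return [(v - lowest) / data_range for v in values], lowest, highest


scaled, lowest, highest = min_max_normalize([9.0, 9.5, 10.0])  # illustrative values
print(scaled)  # [0.0, 0.5, 1.0]
```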
As described above, the teacher data generation unit 205 generates the teacher data 115 by performing standardization processing or normalization processing on the pseudo time series data 114. This makes it possible to generate an inference model that absorbs the characteristics of the time series data of each facility, in other words, an inference model that reflects the characteristics of the time series data of each facility.
Note that the conditions of the standardization processing or normalization processing (statistics such as the mean, standard deviation, maximum value, and minimum value) are also used at the time of inference with the generated inference model. For this reason, the teacher data generation unit 205 may notify the information processing devices 1A to 1D shown in FIG. 2 of the conditions of the standardization processing or normalization processing through communication via the communication unit 22.
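Reusing the same statistics at inference time can be sketched as follows. The container `ScalingStats`, the JSON payload, and the sample values are all assumptions made for illustration; the text does not specify how the conditions are represented or transmitted.

```python
import json
import statistics
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ScalingStats:
    """Statistics fitted on the pseudo time series data (hypothetical container)."""
    mean: float
    std: float


def fit_stats(training_values):
    """Compute the statistics once, on the concatenated training data."""
    return ScalingStats(statistics.fmean(training_values),
                        statistics.pstdev(training_values))


def apply_stats(values, stats):
    """Standardize inference-time values with the training-time statistics."""
    return [(v - stats.mean) / stats.std for v in values]


stats = fit_stats([9.8, 9.1, 8.9, 10.2, 9.0])    # illustrative training values
payload = json.dumps(asdict(stats))               # what could be sent to devices 1A to 1D
restored = ScalingStats(**json.loads(payload))    # what a receiving device reconstructs
inference_input = apply_stats([9.5, 9.7], restored)
```

The point of the sketch is that the inference side never recomputes the mean or standard deviation from its own data.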
The learning unit 206 performs machine learning using the teacher data 115 generated as described above to generate an inference model. The objective variable of the inference model may be the value of the abnormality flag. The explanatory variable of the inference model may be at least one of the measured value, the set value, and the PV/SV value. The learning unit 206 may also include values other than these in the explanatory variables. For example, the learning unit 206 may include, as explanatory variables, at least one of the time series of measured temperature values, the set value, and the PV/SV value included in the temperature sensor data 112A. At least one of the values obtained by standardizing or normalizing these values may also be included in the explanatory variables.
Note that although data collected at facility D is not reflected in the teacher data 115 shown in FIG. 5, an inference model generated using such teacher data 115 can also be applied to inference using inference data collected at facility D.
(Flow of processing executed by information processing device 2)
The flow of processing executed by the information processing device 2 will be described with reference to FIG. 6. FIG. 6 is a flowchart showing an example of processing performed when the information processing device 2 generates an inference model. As described in detail later, the series of processes shown in the flowchart of FIG. 6 includes a method for generating teacher data and a method for generating an inference model.
In S11, the data acquisition unit 201 acquires the data obtained at each facility. For example, the data acquisition unit 201 may acquire the time series sensing data collected at each of the facilities A to C shown in FIG. 2 via the information processing devices 1A to 1C. The data collected at each facility may be stored in advance in the storage unit 21 or the like, in which case the data acquisition unit 201 may acquire the data from the storage unit 21 or the like.
In S12, the connection target determination unit 202 determines, from among the data acquired in S11, the data to be concatenated by the data linking unit 204. As described with reference to FIG. 3, the connection target determination unit 202 may determine the data to be concatenated according to preset rules so that unrelated data is not concatenated.
In S13, the time series data generation unit 203 generates, from the data determined in S12 to be concatenated, the time series data used for concatenation by the data linking unit 204. Specifically, from a measured value included in the data determined in S12 to be concatenated and the set value corresponding to that measured value, the time series data generation unit 203 generates time series data whose elements are the difference or ratio between the measured value and the set value. This process can also be regarded as a process of generating explanatory variables of the inference model from the measured values. In S13, the time series data generation unit 203 may also associate the generated time series data with the value that serves as the objective variable of the inference model.
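The difference-or-ratio computation of S13 can be sketched as below. The function name, the `mode` parameter, and the sample values are hypothetical; the ratio corresponds to the PV/SV value used in the example of FIG. 5.

```python
def deviation_series(measured, setpoints, mode="ratio"):
    """Build a time series of measured/set ratios or measured-minus-set differences."""
    if len(measured) != len(setpoints):
        raise ValueError("measured and set values must have the same length")
    if mode == "ratio":
        return [pv / sv for pv, sv in zip(measured, setpoints)]
    if mode == "difference":
        return [pv - sv for pv, sv in zip(measured, setpoints)]
    raise ValueError("mode must be 'ratio' or 'difference'")


measured = [9.8, 10.2, 10.0]    # illustrative measured values (PV)
setpoints = [10.0, 10.0, 10.0]  # illustrative set values (SV)
pv_sv = deviation_series(measured, setpoints)                # roughly 0.98, 1.02, 1.0
diff = deviation_series(measured, setpoints, "difference")   # roughly -0.2, 0.2, 0.0
```

Expressing each measurement relative to its set value is what lets series from facilities with different operating points be placed on a common scale before concatenation.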
In S14 (data linking step), the data linking unit 204 concatenates a plurality of time series data, each based on data collected at one of a plurality of facilities, to generate one pseudo time series data.
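One way to realize the concatenation of S14 is to assign a single running time index across all series, as in the following sketch. The record layout (`time`, `source`, `value`) and the function name are assumptions for illustration only.

```python
def concatenate_series(series_list):
    """Concatenate (source_id, values) pairs into one pseudo time series,
    assigning a sequential time index that starts at 1."""
    pseudo = []
    for source_id, values in series_list:
        for value in values:
            pseudo.append({"time": len(pseudo) + 1,
                           "source": source_id,
                           "value": value})
    return pseudo


# Illustrative PV/SV series standing in for time series data 113A and 113B.
pseudo = concatenate_series([("113A", [0.98, 1.01]), ("113B", [1.03, 0.99, 1.00])])
print([row["time"] for row in pseudo])  # [1, 2, 3, 4, 5]
```

Keeping the source identifier alongside each value is optional; it is included here only to make the junction positions easy to locate.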
In S15 (teacher data generation step), the teacher data generation unit 205 performs standardization processing or normalization processing on the pseudo time series data generated in S14 to obtain teacher data. If the value serving as the objective variable was not associated with the time series data in S13, the teacher data generation unit 205 associates that value with the pseudo time series data in S15.
In S16 (learning step), the learning unit 206 generates an inference model by machine learning using the teacher data generated in S15, and the processing of FIG. 6 ends. The learning unit 206 may transmit the generated inference model to the information processing device 1. At that time, the statistics used in the standardization processing or normalization processing in S15 may also be transmitted.
The processing of FIG. 6 described above includes a method for generating teacher data. That is, the teacher data generation method executed by the information processing device 2 includes a data linking step (S14) and a teacher data generation step (S15). In the data linking step (S14), a plurality of time series data based on data collected for each of a plurality of facilities (objects) are concatenated to generate one pseudo time series data. In the teacher data generation step (S15), the pseudo time series data generated in S14 is subjected to standardization processing or normalization processing to obtain teacher data. This makes it possible to generate teacher data from which a highly versatile inference model can be generated.
The processing of FIG. 6 also includes a method for generating an inference model. That is, the inference model generation method executed by the information processing device 2 includes a data linking step (S14), a teacher data generation step (S15), and a learning step (S16). In the data linking step (S14), a plurality of time series data based on data collected for each of a plurality of facilities (objects) are concatenated to generate one pseudo time series data. In the teacher data generation step (S15), the pseudo time series data generated in S14 is subjected to standardization processing or normalization processing to obtain teacher data. In the learning step (S16), an inference model is generated by machine learning using the teacher data generated in S15. A highly versatile inference model can therefore be generated.
(Flow of processing executed by information processing device 1)
The data acquisition unit 101 of the information processing device 1 acquires the inference model generated in S16 and the statistics used in the standardization processing or normalization processing in S15. The data acquisition unit 101 also acquires inference data collected for the facility that is the target of inference. For example, the information processing device 1D in FIG. 2 acquires the inference data collected at facility D.
Next, the preprocessing unit 102 standardizes or normalizes the inference data using the above statistics to generate input data for the inference model. The inference unit 103 then inputs the input data generated by the preprocessing unit 102 to the inference model and obtains an inference result based on the output value of the inference model. The control amount determination unit 104 then determines, based on the inference result, a control amount for the equipment installed in the target facility.
Conventionally, when a plurality of facilities exist as shown in FIG. 2, the time series data collected at each facility was standardized or normalized separately for each facility to generate teacher data for each facility, and an inference model dedicated to each facility was generated. Consequently, when performing inference with inference data at a given facility, the standardization processing or normalization processing had to use the statistics applied for that particular facility.
In contrast, in the information processing system 5 of FIG. 2, the information processing device 2 concatenates the time series data collected at the respective facilities into one pseudo time series data and then performs standardization processing or normalization processing. Consequently, the same statistics can be used to standardize or normalize inference data collected at any of the facilities.
[Modifications]
The entity that executes each process described in the above embodiments is arbitrary and is not limited to the above examples. In other words, the devices constituting the information processing system 5 can be changed as appropriate, as long as each process described in the above embodiments can be executed.
For example, in the information processing system 5 of FIG. 2, the information processing device 2, which is separate from the information processing devices 1A to 1D, generates the teacher data and the inference model; however, any of the information processing devices 1A to 1D may generate the teacher data and the inference model. Also, in the information processing system 5 of FIG. 2, the information processing devices 1A to 1D collect data at the facilities A to D and perform inference using the inference model; however, these processes may each be executed by separate information processing devices.
Further, in the information processing system 5 of FIG. 2, a single information processing device 2 generates both the teacher data and the inference model; however, generation of the teacher data and generation of the inference model may each be executed by separate information processing devices.
Further, the processes shown in FIG. 6 need not all be executed by a single device, and may be shared among and executed by any plurality of information processing devices (computers). For example, the information processing device 2 may execute the processes of S11 to S15 in FIG. 6, and another information processing device may execute the process of S16.
[Example of implementation by software]
The functions of the information processing devices 1 and 2 (hereinafter referred to as the "device") can be realized by a program for causing a computer to function as the device, the program causing the computer to function as each control block of the device (in particular, each unit included in the control units 10 and 20). For example, the inference model generation function of the information processing device 2 can be realized by an inference model generation program, and the teacher data generation function of the information processing device 2 can be realized by a teacher data generation program.
In this case, the device includes, as hardware for executing the program, a computer having at least one control device (for example, a processor) and at least one storage device (for example, a memory). The functions described in the above embodiments are realized by executing the program with the control device and the storage device.
The program may be recorded on one or more non-transitory computer-readable recording media. The recording medium may or may not be included in the device. In the latter case, the program may be supplied to the device via any wired or wireless transmission medium.
Some or all of the functions of the control blocks can also be realized by logic circuits. For example, an integrated circuit in which logic circuits functioning as the control blocks are formed is also included in the scope of the present invention. In addition, the functions of the control blocks can also be realized by, for example, a quantum computer.
[Summary]
An information processing device according to aspect 1 of the present invention includes: a data linking unit that concatenates a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data; a teacher data generation unit that performs standardization processing or normalization processing on the pseudo time series data to obtain teacher data; and a learning unit that generates an inference model by machine learning using the teacher data.
In an information processing device according to aspect 2 of the present invention, in aspect 1, the data linking unit may concatenate the plurality of time series data by assigning, to the plurality of time series data, order information indicating a single sequential order.
In an information processing device according to aspect 3 of the present invention, in aspect 1 or 2, the data linking unit may perform the concatenation after performing a process of reducing the discontinuity in the numerical values at the connected portions of the plurality of time series data.
An information processing device according to aspect 4 of the present invention, in any one of aspects 1 to 3, includes a time series data generation unit that generates, for each of the plurality of objects, the time series data whose elements are the difference or ratio between a measured value measured for the object and the set value corresponding to the measured value.
An information processing device according to aspect 5 of the present invention includes: a data linking unit that concatenates a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data; and a teacher data generation unit that performs standardization processing or normalization processing on the pseudo time series data to obtain teacher data.
A generation method according to aspect 6 of the present invention is a method for generating an inference model executed by one or more information processing devices, the method including: a data linking step of concatenating a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data; a teacher data generation step of performing standardization processing or normalization processing on the pseudo time series data to obtain teacher data; and a learning step of generating an inference model by machine learning using the teacher data.
A generation method according to aspect 7 of the present invention is a method for generating teacher data executed by one or more information processing devices, the method including: a data linking step of concatenating a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data; and a teacher data generation step of performing standardization processing or normalization processing on the pseudo time series data to obtain teacher data.
An inference model generation program according to aspect 8 of the present invention is an inference model generation program for causing a computer to function as the information processing device according to aspect 1, and causes the computer to function as the data linking unit, the teacher data generation unit, and the learning unit.
A teacher data generation program according to aspect 9 of the present invention is a teacher data generation program for causing a computer to function as the information processing device according to aspect 5, and causes the computer to function as the data linking unit and the teacher data generation unit.
The present invention is not limited to the embodiments described above, and various modifications are possible within the scope of the claims. Embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention.
2 Information processing device
203 Time series data generation unit
204 Data linking unit
205 Teacher data generation unit
206 Learning unit
111A, 111B, 111C Steam amount sensor data (measured value, set value)
112A, 112B, 112C Temperature sensor data (measured value, set value)
113A, 113B Time series data
114 Pseudo time series data
115 Teacher data

Claims (9)

  1.  An information processing device comprising:
     a data linking unit that concatenates a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data;
     a teacher data generation unit that performs standardization processing or normalization processing on the pseudo time series data to obtain teacher data; and
     a learning unit that generates an inference model by machine learning using the teacher data.
  2.  The information processing device according to claim 1, wherein the data linking unit concatenates the plurality of time series data by assigning, to the plurality of time series data, order information indicating a single sequential order.
  3.  The information processing device according to claim 1 or 2, wherein the data linking unit performs the concatenation after performing a process of reducing the discontinuity in the numerical values at the connected portions of the plurality of time series data.
  4.  The information processing device according to claim 1 or 2, further comprising a time series data generation unit that generates, for each of the plurality of objects, the time series data whose elements are the difference or ratio between a measured value measured for the object and the set value corresponding to the measured value.
  5.  An information processing device comprising:
     a data linking unit that concatenates a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data; and
     a teacher data generation unit that performs standardization processing or normalization processing on the pseudo time series data to obtain teacher data.
  6.  A method for generating an inference model executed by one or more information processing devices, the method comprising:
     a data linking step of concatenating a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data;
     a teacher data generation step of performing standardization processing or normalization processing on the pseudo time series data to obtain teacher data; and
     a learning step of generating an inference model by machine learning using the teacher data.
  7.  A method for generating teacher data executed by one or more information processing devices, the method comprising:
     a data linking step of concatenating a plurality of time series data based on data collected for each of a plurality of objects to generate one pseudo time series data; and
     a teacher data generation step of performing standardization processing or normalization processing on the pseudo time series data to obtain teacher data.
  8.  An inference model generation program for causing a computer to function as the information processing device according to claim 1, the program causing the computer to function as the data linking unit, the teacher data generation unit, and the learning unit.
  9.  A teacher data generation program for causing a computer to function as the information processing device according to claim 5, the program causing the computer to function as the data linking unit and the teacher data generation unit.
PCT/JP2023/020963 2022-07-06 2023-06-06 Information processing device, inference model generation method, training data generation method, inference model generation program, and training data generation program WO2024009667A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-109243 2022-07-06
JP2022109243A JP2024007873A (en) 2022-07-06 2022-07-06 Information processing device, generation method of inference model, generation method of teacher data, inference model generation program, and teacher data generation program

Publications (1)

Publication Number Publication Date
WO2024009667A1 true WO2024009667A1 (en) 2024-01-11

Family

ID=89453161

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/020963 WO2024009667A1 (en) 2022-07-06 2023-06-06 Information processing device, inference model generation method, training data generation method, inference model generation program, and training data generation program

Country Status (3)

Country Link
JP (1) JP2024007873A (en)
TW (1) TW202403485A (en)
WO (1) WO2024009667A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100161810A1 (en) * 2008-12-19 2010-06-24 Sun Microsystems, Inc. Generating a training data set for a pattern-recognition model for electronic prognostication for a computer system
JP2021170244A (en) * 2020-04-16 2021-10-28 株式会社日立製作所 Learning model construction system and method of the same
EP3992739A1 (en) * 2020-10-29 2022-05-04 Siemens Aktiengesellschaft Automatically generating training data of a time series of sensor data

Also Published As

Publication number Publication date
TW202403485A (en) 2024-01-16
JP2024007873A (en) 2024-01-19

JP2000099333A (en) Plant interface agent and plant operation condition monitoring method
Sánchez et al. Online diagnosis using influence diagrams
Athanasopoulou et al. Utilizing data mining algorithms for identification and reconstruction of sensor faults: a Thermal Power Plant case study

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23835203

Country of ref document: EP

Kind code of ref document: A1