WO2018193934A1

WO2018193934A1 - Evaluation apparatus, evaluation method, and program therefor

Info

Publication number: WO2018193934A1
Application number: PCT/JP2018/015244
Authority: WO
Inventors: Tanichi Ando
Original assignee: Omron Corporation
Priority date: 2017-04-20
Filing date: 2018-04-11
Publication date: 2018-10-25
Also published as: JP2018181184A; JP6917004B2

Abstract

An evaluation apparatus (60) configured to evaluate a second learning module obtained by additionally training a first learning module includes: a learning objective acceptance unit (203) configured to accept a learning objective that is to be achieved by the second learning module, an evaluation unit (204) configured to evaluate the second learning module with respect to at least an evaluation item included in the learning objective, and produce evaluation data, a determination unit (205) configured to determine whether or not the second learning module has achieved the learning objective using the learning objective and the evaluation data, and a learning module acquisition unit (206) configured to acquire, if it is determined that the learning objective is not achieved, a third learning module that is different from the second learning module, based on at least the learning objective.

Description

EVALUATION APPARATUS, EVALUATION METHOD, AND PROGRAM THEREFOR

The present invention relates to an evaluation apparatus, an evaluation method, and a program therefor.
CROSS-REFERENCES TO RELATED APPLICATIONS
This application claims priority to Japanese Patent Application No. 2017-083501 filed April 20, 2017, the entire contents of which are incorporated herein by reference.

Controlling a system using a neural network technique is conventionally known. For example, JP 2005-255289A discloses an elevator system in which a neural network is used to select, from a plurality of cars, the optimal car to be moved to a landing.

Also, JP H9-62648A discloses a method for additionally training a neural network in order to reduce false recognition in a pattern recognition apparatus that uses the neural network.

JP 2005-255289A and JP H9-62648A are examples of background art.

In a control system that uses a neural network technique, in order to increase the control performance, the neural network is additionally trained in some cases. For example, a case is conceivable in which at the time of first training, training data A that is randomly selected from data obtained in a time period from time 1 to time 2 shown in FIG. 11 is used, and thereafter training data B that is randomly selected from data obtained in a time period from the time 2 to time 3 shown in FIG. 11 is used to perform additional training.

In this case, the training data B sometimes includes inappropriate training data due to random selection of training data. When additional training is performed using additional training data B that includes inappropriate training data in this manner, the control performance of the additionally trained neural network sometimes deteriorates compared to the neural network before additional training.

Also, if a special event occurs in the time period from the time 2 to the time 3, there is a possibility that the additionally trained neural network will become specialized for the special event. As a result, there is a risk that the control performance of the additionally trained neural network will deteriorate compared to that of the neural network before additional training.

In view of this, an object of the present invention is to provide an evaluation apparatus, an evaluation method, and a program therefor that can handle a case where the performance of a learning module deteriorates after additional training.

An evaluation apparatus configured to evaluate a second learning module obtained by additionally training a first learning module according to one aspect of the present invention includes a learning objective acceptance unit configured to accept a learning objective that is to be achieved by the second learning module; an evaluation unit configured to evaluate the second learning module with respect to at least an evaluation item included in the learning objective, and produce evaluation data; a determination unit configured to determine whether or not the second learning module has achieved the learning objective using the learning objective and the evaluation data; and a learning module acquisition unit configured to acquire, if it is determined that the learning objective is not achieved, a third learning module that is different from the second learning module, based on at least the learning objective.

According to this aspect, if the learning module has not achieved the learning objective after additional training, it is possible to avoid continued use of that learning module, and thus to increase the reliability of the system in which the learning module is used. For example, avoiding use of the learning module that has not achieved the learning objective makes it possible to prevent a decrease in the processing precision of the overall system. Also, with a configuration in which, if the learning objective is not achieved, a learning module that is different from the learning module that has not achieved the learning objective is acquired based on at least the learning objective, neither an apparatus that performs given processing using a learning module nor the evaluation apparatus that performs evaluation needs to hold a plurality of learning modules, and thus the hardware resources for recording the learning modules can be minimized.

In the above-described evaluation apparatus, the third learning module may be the first learning module. According to this aspect, it is possible to continue system processing using a learning module whose performance has not deteriorated, and thus to prevent a decrease in the stability of the system that uses the learning module.

The above-described evaluation apparatus may further include a factor estimation unit configured to estimate a factor that affects failure in achievement of the learning objective. In this case, the factor estimation unit may be configured to produce factor improvement data relating to training data used to additionally train a learning module. According to this aspect, it is possible to obtain useful information from the learning module whose performance has deteriorated. By utilizing such information, it is possible to reduce the amount of training data and processing that are necessary for improving precision in the learning module and to reduce the load applied to the CPU.

In the above-described evaluation apparatus, the learning module acquisition unit may produce a learning instruction based on the factor improvement data. According to this aspect, it is possible to efficiently perform additional training that addresses the problem of the learning module whose performance has deteriorated. Using the efficiently additionally trained learning module in the system makes it possible to recover, in a minimal time period, from performance deterioration caused by updating the learning module.

In the above-described evaluation apparatus, the factor estimation unit may output the estimated factor to a display unit, receive a user input relating to the estimated factor, and produce the factor improvement data based on the user input. According to this aspect, information relating to a performance deterioration factor can be produced based on more reliable information.

In the above-described evaluation apparatus, the learning module may be configured to perform control of a system. In this case, the evaluation unit may be configured to obtain sensing data output by a sensor during a time period in which the control of the system is performed by the learning module and produce the evaluation data using the sensing data. According to this aspect, it is possible to objectively evaluate the second learning module and increase the reliability of the system control in which the learning module is used.

An evaluation method for evaluating a second learning module obtained by additionally training a first learning module according to another aspect of the present invention includes: accepting, by a computer including a controller, a learning objective that is to be achieved by the second learning module; evaluating, by the computer, the second learning module with respect to at least an evaluation item included in the learning objective, and producing evaluation data; determining, by the computer, whether or not the second learning module has achieved the learning objective using the learning objective and the evaluation data; and acquiring, by the computer, if it is determined that the learning objective is not achieved, a third learning module that is different from the second learning module, based on at least the learning objective.

A program according to another aspect of the present invention is a program for causing a computer to evaluate a second learning module obtained by additionally training a first learning module, the program causing the computer to execute: accepting a learning objective that is to be achieved by the second learning module; evaluating the second learning module with respect to at least an evaluation item included in the learning objective, and producing evaluation data; determining whether or not the second learning module has achieved the learning objective using the learning objective and the evaluation data; and acquiring, if it is determined that the learning objective is not achieved, a third learning module that is different from the second learning module.

According to the present invention, it is possible to provide an evaluation apparatus, an evaluation method, and a program therefor that can handle a case where the performance of a learning module deteriorates after additional training.

FIG. 1 is a diagram showing a network configuration of a learning system according to an embodiment of the present invention.
FIG. 2 is a diagram showing a physical configuration of a learning apparatus according to an embodiment of the present invention.
FIG. 3 is a diagram showing a physical configuration of an additional training control apparatus according to an embodiment of the present invention.
FIG. 4 is a functional block diagram of the learning apparatus according to an embodiment of the present invention.
FIG. 5 is a functional block diagram of the additional training control apparatus according to an embodiment of the present invention.
FIG. 6 is a flowchart showing the flow of additional training processing executed by the additional training control apparatus according to an embodiment of the present invention.
FIG. 7 is a flowchart showing the flow of the additional training processing executed by the additional training control apparatus according to an embodiment of the present invention.
FIG. 8 is one example of a screen output by the additional training control apparatus according to an embodiment of the present invention.
FIG. 9 is a diagram showing a network configuration of a learning system according to another embodiment of the present invention.
FIG. 10 is a diagram showing a physical configuration of an evaluation apparatus according to an embodiment of the present invention.
FIG. 11 is a diagram showing a time series when training data is selected.

Embodiments of the present invention will be described below with reference to the accompanying drawings. Note that the embodiments below are merely for facilitating understanding of the present invention and are not intended to limit the interpretation of the present invention. Also, various modifications can be made on the present invention without departing from the gist of the present invention. Furthermore, persons skilled in the art can adopt an embodiment obtained by substituting elements that will be described below with equivalent elements, and such an embodiment is included in the scope of the present invention.

Network configuration
A network configuration of a learning system 1 according to a predetermined embodiment of the present invention will be described below with reference to FIG. 1. The learning system 1 includes a learning apparatus 10, an additional training control apparatus 20, one or more sensors 30, and a storage apparatus 40. The learning apparatus 10 is connected to the additional training control apparatus 20, the one or more sensors 30, and the storage apparatus 40 via a communication network N. The communication network N may be a wired communication network configured with wired lines or a wireless communication network configured with radio lines, and may be the Internet or a local area network (LAN).

The learning apparatus 10 trains a learning module based on training data stored in the storage apparatus 40, and stores the trained learning module in the storage apparatus 40. The learning apparatus 10 according to the present embodiment includes the learning module, but the learning module may be provided in an apparatus separate from the learning apparatus 10.

Note that the learning module includes one unit of dedicated or general-purpose hardware or software that is provided with a learning capability, or a combination of single units of this hardware or software. Examples of the learning module that performs this learning include a learning module that has performed some kind of learning using the training data and a learning module that has not performed learning. Here, "learning capability" refers to a capability by which a task processing capability can be improved based on experience obtained from the training data.

The additional training control apparatus 20 outputs output data corresponding to the characteristics of input data using the trained learning module. The additional training control apparatus 20 according to the present embodiment acquires, from the learning apparatus 10, the trained learning module or a copy of this trained learning module, and sets it as a learning module. The additional training control apparatus 20 can evaluate the output data that is output using the set learning module, and output evaluation data. The additional training control apparatus 20 can determine whether or not the currently set learning module has achieved a learning objective, using the learning objective that is to be achieved by the set learning module and the evaluation data. If it is determined that the learning objective has not been achieved, the additional training control apparatus 20 may acquire, from the learning apparatus 10, the learning module that was previously set, and set the acquired learning module as a learning module, for example. Note that the additional training control apparatus 20 is provided with a functional configuration of an evaluation apparatus 60, which will be described later, and substantially includes the evaluation apparatus 60.

Note that the copy of the trained learning module may be one unit of dedicated or general-purpose hardware or software that can reproduce the function of the trained learning module, or a combination of single units of this hardware or software.

The copy of the trained learning module is not necessarily required to be provided with a learning capability. Also, the configuration of the trained learning module and the configuration of the copy of the trained learning module are not necessarily required to match each other. Also, the copy of the trained learning module may be a learning module obtained through so-called distillation. That is, the copy of the trained learning module may be another trained learning module that has been obtained by training another learning module having a different structure from the trained learning module, so as to have the function of the trained learning module.

Here, the other learning module may have a simpler structure than the trained module and be more appropriate for deployment, and data output from the trained module may be used in learning of the other learning module. Note that the copy of the trained module may be a trained module obtained by changing a regularization method by which overfitting is prevented in a process during which the learning module performs learning, changing the learning rate of back-propagation, or changing the weighting coefficient updating algorithm.
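The distillation described above can be illustrated with a minimal sketch. It assumes a hypothetical teacher function standing in for the trained module and a simpler linear student trained by gradient descent on the teacher's outputs; the weights, learning rate, and function names are illustrative assumptions and are not taken from the embodiment.

```python
import math
import random


def teacher(x: float) -> float:
    """Stand-in for the trained module (hypothetical fixed weights)."""
    hidden = [math.tanh(0.8 * x + 0.1), math.tanh(-0.5 * x + 0.3)]
    return 1.2 * hidden[0] - 0.7 * hidden[1]


def mse(w: float, b: float, xs: list, targets: list) -> float:
    """Mean squared error of the linear student w * x + b against the targets."""
    return sum((w * x + b - t) ** 2 for x, t in zip(xs, targets)) / len(xs)


def distill(xs: list, steps: int = 200, lr: float = 0.1) -> tuple:
    """Train a simpler (linear) student using data output from the teacher."""
    targets = [teacher(x) for x in xs]  # teacher outputs serve as training targets
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - t) * x for x, t in zip(xs, targets)) / n
        grad_b = sum(2 * (w * x + b - t) for x, t in zip(xs, targets)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


rng = random.Random(0)  # fixed seed so the sketch is reproducible
inputs = [rng.uniform(-1.0, 1.0) for _ in range(50)]
labels = [teacher(x) for x in inputs]
student_w, student_b = distill(inputs)
```

The student here has a different and simpler structure than the teacher and only approximates its function, which matches the statement that the configurations of the trained module and its copy are not required to match.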

Also, acquisition of the trained module or the copy of this trained module refers to acquisition of information required to reproduce the function of the trained module in the additional training control apparatus 20. For example, if the learning module includes a neural network, acquisition of the trained module or the copy of the trained module refers to acquisition of information relating to the number of layers of the neural network, the number of nodes relating to the layers, weight parameters of links connecting nodes, bias parameters relating to the nodes, and a function form of activation function on nodes.
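As a rough illustration of the information listed above, the following sketch stores, for each layer, the weight parameters, bias parameters, and the name of the activation function, and reproduces the function of the module with a forward pass. The class name LayerSpec, the activation names, and the example values are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class LayerSpec:
    weights: list      # weights[j][i]: link from input node i to node j of this layer
    biases: list       # one bias parameter per node of this layer
    activation: str    # function form of the activation function


ACTIVATIONS = {
    "tanh": math.tanh,
    "relu": lambda v: max(0.0, v),
    "identity": lambda v: v,
}


def forward(layers: list, inputs: list) -> list:
    """Reproduce the function of the module from the acquired information."""
    values = inputs
    for layer in layers:
        act = ACTIVATIONS[layer.activation]
        values = [
            act(sum(w * v for w, v in zip(row, values)) + bias)
            for row, bias in zip(layer.weights, layer.biases)
        ]
    return values


# Hypothetical acquired information: two layers, with 2 nodes and 1 node.
module_info = [
    LayerSpec(weights=[[1.0, -1.0], [0.5, 0.5]], biases=[0.0, 0.0], activation="relu"),
    LayerSpec(weights=[[1.0, 2.0]], biases=[0.5], activation="identity"),
]
output = forward(module_info, [2.0, 1.0])
```

In this representation the number of layers is the length of the list, and the number of nodes in a layer is the length of its bias list.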

The sensor 30 may be any of a physical quantity sensor that detects a physical quantity, a chemical quantity sensor that detects a chemical quantity, and an information sensor that detects information, but is not limited to these and may include an arbitrary sensor. Examples of the physical quantity sensor include a camera that detects light and outputs image data or movie data, and a vital sensor such as a heart beat sensor that detects the heart beat of a person and outputs the heart beat data, a blood pressure sensor that detects the blood pressure of a person and outputs blood pressure data, or a body temperature sensor that detects the body temperature of a person and outputs body temperature data, and include a sensor that detects any other physical quantity and outputs an electrical signal. Examples of the chemical quantity sensor include a gas sensor, a humidity sensor, and an ion sensor, and include a sensor that detects any other chemical quantity and outputs an electrical signal. Examples of the information sensor include a sensor that detects a specific pattern from statistical data, and include a sensor that detects any other information.

The storage apparatus 40 stores sensing data output by the sensor 30. Also, the storage apparatus 40 stores the trained module output by the learning apparatus 10. Although FIG. 1 shows the storage apparatus 40 as a single storage unit, the storage apparatus 40 may be configured by one or more file servers.

Note that the learning apparatus 10, the additional training control apparatus 20, and the storage apparatus 40 are configured as separate apparatuses in FIG. 1, but these apparatuses may be configured as a single apparatus. That is, the learning apparatus 10, the additional training control apparatus 20, and the storage apparatus 40 may be all configured as a single apparatus, or two out of the learning apparatus 10, the additional training control apparatus 20, and the storage apparatus 40 may be selectively configured as a single apparatus. At this time, elements of the learning apparatus 10, the additional training control apparatus 20, and the storage apparatus 40 that are configured as a single apparatus are connected via internal buses.

Physical configuration: learning apparatus
A physical configuration of the learning apparatus 10 according to a predetermined embodiment of the present invention will be described below with reference to FIG. 2. The learning apparatus 10 has a controller 10a, a storage unit 10b, a communication unit 10c, an input unit 10d, and a display unit 10e. These constituent elements are connected to each other via a bus so as to exchange data with each other.

The controller 10a includes a central processing unit (CPU) corresponding to a hardware processor and a random access memory (RAM) corresponding to a memory. The controller 10a functions as units shown in FIG. 4, which will be described later, by the CPU loading a program stored in the storage unit 10b to the RAM, and interpreting and executing this program loaded to the RAM.

Note that the type of hardware processor is not limited to the CPU. For example, as the hardware processor, a CPU, a graphics processing unit (GPU), a field-programmable gate array (FPGA), a digital signal processor (DSP), and an application specific integrated circuit (ASIC) can be used alone or in combination. The RAM is a storage unit in which data can be rewritten, and is constituted by a semiconductor memory device, for example. The RAM temporarily stores data and programs such as applications executed by the CPU.

The storage unit 10b is a non-volatile storage medium such as a hard disk drive (HDD) or a solid state drive (SSD). The storage unit 10b stores programs executed by the CPU and data.

The communication unit 10c is a hardware interface that connects the learning apparatus 10 to the communication network N.

The input unit 10d accepts input from a user, and is constituted by a keyboard, a mouse, and a touch panel, for example.

The display unit 10e visually displays a result of processing performed by the CPU, and is constituted by a liquid crystal display (LCD), for example.

The learning apparatus 10 may be configured by the CPU of a general-purpose personal computer executing a learning program according to the present embodiment. The learning program may be provided by being stored in a computer-readable storage medium such as the RAM or the storage unit 10b, or may be provided via the communication network N connected by the communication unit 10c. These physical configurations are merely examples and do not necessarily need to be independent configurations.

Physical configuration: additional training control apparatus
A physical configuration of the additional training control apparatus 20 according to a predetermined embodiment of the present invention will be described with reference to FIG. 3. Similarly to the learning apparatus 10, the additional training control apparatus 20 also includes a controller 20a including a CPU and a RAM, a storage unit 20b that stores data and the like, a communication unit 20c for connection to the network N, an input unit 20d that accepts an input from a user, a display unit 20e, and the like. These constituent elements are connected to each other via a bus so as to exchange data with each other. The controller 20a functions as units shown in FIG. 5, which will be described later, by the CPU loading a program stored in the storage unit 20b to the RAM, and interpreting and executing this program loaded to the RAM.

The additional training control apparatus 20 may be configured by the CPU of a general-purpose personal computer executing an additional training control program, for example. The additional training control program may be provided by being stored in a computer-readable storage medium such as the RAM or the storage unit 20b, or may be provided via the communication network N connected by the communication unit 20c.

Functional configuration: learning apparatus
A functional configuration of the learning apparatus 10 according to a predetermined embodiment of the present invention will be described below with reference to FIG. 4. The learning apparatus 10 includes a learning instruction acceptance unit 101, a training data acquisition unit 102, a learning controller 103, a learning module 104, a trained module output unit 105, and a trained module extraction unit 106.

The learning instruction acceptance unit 101 accepts a learning instruction from a user via the input unit 10d or a learning instruction from the additional training control apparatus 20 via the communication unit 10c, and delivers information included in the learning instruction to the training data acquisition unit 102, which will be described later. In the present embodiment, the learning instruction includes a training data acquisition condition, an input parameter designation, and the like. The training data acquisition condition is a condition that data must satisfy, in accordance with the learning instruction from the user, in order to be used as training data for training the learning module 104. For example, a condition that designates the acquisition date and time of the data acquired by the sensor 30 may be used. The input parameter is, among the information included in the learning instruction, a factor that influences the control performance of the trained module.

The training data acquisition unit 102 receives the training data acquisition condition and acquires training data from the storage apparatus 40 based on the received training data acquisition condition.
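A training data acquisition condition that designates an acquisition date and time can be sketched as a simple filter over stored sensing data. The record layout, the class names, and the half-open time range are assumptions chosen for illustration; the embodiment does not prescribe a data format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SensingRecord:
    acquired_at: datetime  # acquisition date and time of the sensing data
    value: float


@dataclass
class AcquisitionCondition:
    start: datetime  # inclusive lower bound (assumption)
    end: datetime    # exclusive upper bound (assumption)


def acquire_training_data(records: list, condition: AcquisitionCondition) -> list:
    """Return only the records whose acquisition time satisfies the condition."""
    return [r for r in records if condition.start <= r.acquired_at < condition.end]


# Hypothetical contents of the storage apparatus.
stored = [
    SensingRecord(datetime(2017, 4, 1, 9, 0), 20.5),
    SensingRecord(datetime(2017, 4, 15, 9, 0), 21.0),
    SensingRecord(datetime(2017, 5, 2, 9, 0), 35.0),
]
condition = AcquisitionCondition(datetime(2017, 4, 1), datetime(2017, 5, 1))
training_data = acquire_training_data(stored, condition)
```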

The learning controller 103 trains the learning module 104 using the training data acquired by the training data acquisition unit 102. The learning controller 103 completes training based on the learning instruction accepted by the learning instruction acceptance unit 101. The standard for determining that the training is complete may be a case where the learning module 104 is trained using a predetermined number of pieces of training data, for example. Also, when the control performance of the trained module meets a learning objective, which will be described later, training may be completed. When the training is complete, the learning controller 103 stores the trained module in the storage apparatus 40. At this time, in the present embodiment, the learning controller 103 stores the trained modules in association with learning module identifiers with which the trained modules can be uniquely identified and training data acquisition conditions.

Here, it is desirable that the versions of the trained modules stored in the storage apparatus 40 are managed so that the update history can be traced. The version may be managed using the learning module identifier or version information that is provided separately.

The learning module 104 is a module for realizing machine learning. Here, a working example in which a neural network is applied as one example of the learning module 104 will be described. However, the neural network is merely one example of the learning module 104, and another configuration may be applied to the learning apparatus 10 as the learning module 104.

The trained module output unit 105 outputs the trained module and the learning module identifier to an external apparatus such as the additional training control apparatus 20.

The trained module extraction unit 106 receives the learning module extraction condition, and acquires the trained module from the storage apparatus 40 based on the received learning module extraction condition. In the present embodiment, the learning module extraction condition includes the learning module identifier of the currently set learning module and an extraction point. The extraction point includes information by which a learning module that is to be extracted can be designated, such as designation of the date before deterioration of the performance or the version of a learning module prior to the currently set learning module. For example, the extraction point may be "before Dec. 31, 2017" or "the immediately previous version of the currently set learning module".
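The two kinds of extraction point mentioned above can be sketched as lookups over stored module records. The record fields, the identifiers, and the choice to return the most recent module trained strictly before the designated date are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class StoredModule:
    module_id: str   # learning module identifier
    version: int     # version information (assumed to be a running number)
    trained_on: date


def extract_previous_version(store: list, current_id: str) -> Optional[StoredModule]:
    """Extraction point: the immediately previous version of the current module."""
    current = next(m for m in store if m.module_id == current_id)
    older = [m for m in store if m.version < current.version]
    return max(older, key=lambda m: m.version) if older else None


def extract_before(store: list, cutoff: date) -> Optional[StoredModule]:
    """Extraction point: the most recent module trained before the given date."""
    older = [m for m in store if m.trained_on < cutoff]
    return max(older, key=lambda m: m.trained_on) if older else None


# Hypothetical contents of the storage apparatus.
store = [
    StoredModule("m-001", 1, date(2017, 10, 1)),
    StoredModule("m-002", 2, date(2017, 12, 15)),
    StoredModule("m-003", 3, date(2018, 1, 20)),
]
previous = extract_previous_version(store, "m-003")
before_cutoff = extract_before(store, date(2017, 12, 31))
```

Both lookups return None when no stored module satisfies the extraction point, which leaves the handling of that case to the caller.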

Functional configuration: additional training control apparatus
A functional configuration of the additional training control apparatus 20 according to a predetermined embodiment of the present invention will be described below with reference to FIG. 5. The additional training control apparatus 20 includes a trained module acceptance unit 201, a learning module 202, a learning objective acceptance unit 203, an evaluation unit 204, a determination unit 205, a learning module acquisition unit 206, a factor estimation unit 207, a controller 208, and a database (DB) 209.

The trained module acceptance unit 201 accepts a trained module and a learning module identifier and sets the accepted trained module as the learning module 202. In the present embodiment, the trained module acceptance unit 201 accepts a trained module and a learning module identifier from the trained module output unit 105 of the learning apparatus 10 and sets the accepted trained module as the learning module 202. Note that the trained module acceptance unit 201 may accept the trained module from the storage apparatus 40 and set the accepted trained module as the learning module 202. Here, a working example in which a neural network is applied as one example of the learning module 202 will be described. However, the neural network is merely one example of the learning module 202, and another configuration may be applied to the additional training control apparatus 20 as the learning module 202.

The learning objective acceptance unit 203 accepts a learning objective that is to be achieved by the learning module 202 via the input unit 20d, and stores the accepted learning objective in the DB 209. In the present embodiment, the learning objective includes one or more evaluation items and conditions associated with the respective evaluation items. The evaluation items are items for evaluating the learning module 202, or more specifically the performance of the learning module 202, and are used to determine the accuracy of the output data output from the learning module, for example.

A condition can be set for each evaluation item. For example, for the evaluation item "the number of external operations per day performed on an apparatus controlled by the controller 208", the condition may be "this evaluation item is a reference value x or less" or "this evaluation item is smaller than that of the immediately previously set learning module". For the evaluation item "a power consumption per month of an apparatus controlled by the controller 208", the condition may be "this evaluation item is a reference value y or less" or "this evaluation item is smaller than that of the immediately previously set learning module". For the evaluation item "the number of instances per month in which a value calculated by the learning module 202 in a predetermined time period exceeds an allowable change ratio", the condition may be "this evaluation item is a reference value z or less", "this evaluation item is smaller than that of the immediately previously set learning module", or the like.

The evaluation unit 204 obtains the sensing data stored in the storage apparatus 40, produces evaluation data by evaluating the learning module 202 using the obtained sensing data, and stores the produced evaluation data in the DB 209 in association with the learning module identifier accepted by the trained module acceptance unit 201. In the present embodiment, the evaluation unit 204 can evaluate at least the evaluation items included in the learning objective and produce the evaluation data. The evaluation unit 204 can automatically perform evaluation after a certain time period has elapsed since the learning module 202 was set, or may accept an evaluation instruction from a user and perform evaluation. For example, the evaluation unit 204 may perform evaluation in response to the acceptance of the learning objective. Note that the evaluation unit 204 may perform evaluation with regard to a preset evaluation item in addition to the evaluation items included in the learning objective.
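As one possible reading of how evaluation data could be produced from sensing data, the sketch below computes two of the evaluation items mentioned in this description: the number of external operations per day and the power consumption over the observed period. The input layout, the item names, and the use of a simple average per observed day are assumptions, not details given in the embodiment.

```python
from datetime import date


def produce_evaluation_data(operation_dates: list, daily_power_kwh: dict) -> dict:
    """Compute evaluation items from sensing data (hypothetical layout).

    operation_dates: one entry per external operation performed on the apparatus.
    daily_power_kwh: power consumption recorded for each observed day.
    """
    days_observed = len(daily_power_kwh)
    operations_per_day = len(operation_dates) / days_observed if days_observed else 0.0
    return {
        "external_operations_per_day": operations_per_day,
        "power_consumption": sum(daily_power_kwh.values()),
    }


# Hypothetical sensing data covering two days.
operations = [date(2018, 1, 10), date(2018, 1, 10), date(2018, 1, 11)]
power = {date(2018, 1, 10): 10.0, date(2018, 1, 11): 12.5}
evaluation_data = produce_evaluation_data(operations, power)
```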

The determination unit 205 uses the learning objective accepted by the learning objective acceptance unit 203 and the evaluation data stored in the DB 209 to determine whether or not the currently set learning module has achieved the learning objective. More specifically, the determination unit 205 determines whether or not the currently set learning module has achieved the learning objective based on whether or not the evaluation data produced for the evaluation item meets the condition associated with the evaluation item. The determination unit 205 may determine whether or not the currently set learning module has achieved the learning objective based on a plurality of evaluation items and conditions as necessary. If it is determined that the learning objective has been achieved, the determination unit 205 ends the processing.
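The determination described above can be sketched as a check of each evaluation item against its associated condition. The Condition class, the item names, and the rule that a comparison with the previously set learning module fails when no previous evaluation data is available are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Condition:
    item: str                          # name of the evaluation item
    reference: Optional[float] = None  # "reference value or less", if set
    below_previous: bool = False       # "smaller than the previously set module"


def has_achieved(objective: list, evaluation: dict, previous: Optional[dict] = None) -> bool:
    """Return True only if every condition in the learning objective is met."""
    for cond in objective:
        value = evaluation[cond.item]
        if cond.reference is not None and value > cond.reference:
            return False
        if cond.below_previous:
            # Assumption: without previous evaluation data the condition cannot
            # be verified, so it is treated as not met.
            if previous is None or value >= previous[cond.item]:
                return False
    return True


objective = [
    Condition("external_operations_per_day", reference=2.0),
    Condition("power_consumption", below_previous=True),
]
current = {"external_operations_per_day": 1.5, "power_consumption": 22.5}
result_ok = has_achieved(objective, current, {"power_consumption": 25.0})
result_ng = has_achieved(objective, current, {"power_consumption": 20.0})
```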

If it is determined that the learning objective has not been achieved, the learning module acquisition unit 206 transmits a learning module extraction condition to the trained module extraction unit 106 of the learning apparatus 10, as will be described later, and sets the trained module obtained as a response thereto as the learning module 202. The trained module obtained in this manner may be evaluated again for achievement of the learning objective before it is set as the learning module 202.

Also, the learning module acquisition unit 206 may instruct the factor estimation unit 207, which will be described later, to estimate the factor of failure in the achievement of the learning objective and obtain factor improvement data. The factor improvement data is data for handling the factor to achieve the learning objective. The factor improvement data may include instructions for removing the factor or reducing the influence of the factor. Thereafter, the learning module acquisition unit 206 may use the factor improvement data obtained from the factor estimation unit 207 to instruct the learning apparatus 10 to perform further additional training.

If it is determined that the currently set learning module has not achieved the learning objective, the factor estimation unit 207 estimates the factor of failure in the achievement of the learning objective. In the present embodiment, the factor estimation unit 207 estimates the factor of failure in the achievement of the learning objective using the learning objective and the sensing data of the storage apparatus 40, in response to the instruction from the learning module acquisition unit 206. In order to estimate the factor of failure in the achievement of the learning objective, the sensing data of the storage apparatus 40 is subjected to statistical processing.

The controller 208 performs control using the value calculated by the learning module 202. Although a controller that performs control is described as one example in which system control (control of a system) is performed using the learning module in the present embodiment, as described later, an embodiment of the present invention can be applied to various systems that execute processing using the learning module.

A learning objective DB 2091 and evaluation data DB 2092 are stored in the DB 209. In the present embodiment, evaluation items and conditions that are included in the learning objective accepted by the learning objective acceptance unit 203 are stored in the learning objective DB 2091. Also, the evaluation data produced by the evaluation unit 204 and the learning module identifiers of the learning modules that are to be evaluated are stored in the evaluation data DB 2092 in association with each other.

Additional training processing
First embodiment
A first embodiment of additional training processing executed by the additional training control apparatus 20 will be described with reference to a flowchart shown in FIG. 6. Embodiments of the present invention can be applied to various systems that execute processing using the learning module, and although there are no limitations on the field, an air conditioning control system will be described as an example below.

In a first embodiment, the controller 208 is an air-conditioning controller, for example. The learning module 202 uses, as input parameters, values such as the current room temperature, an ambient temperature, and humidity, which are output by the sensor 30, a known date and time, the volume of a room, and the like, and calculates a room temperature setting value. The controller 208 of the additional training control apparatus 20 performs air-conditioning control using the room temperature setting value calculated by the learning module 202. Here, a working example will be described in which the controller 208 actually performs air-conditioning control using the learning module 202, and the evaluation unit 204 evaluates the result of control performed by the controller 208. However, alternatively, the additional training control apparatus 20 may include a simulation unit (not shown). In an alternative working example, the simulation unit can perform air-conditioning control using the learning module 202, and the evaluation unit 204 can evaluate the result of simulation performed by the simulation unit.

Also, although a working example in which the additional training control apparatus 20 includes the learning module 202 and the controller 208 will be described herein, alternatively, a control apparatus separate from the additional training control apparatus 20 may include the learning module and the controller.

It is assumed that as a result of learning that was performed in advance by the learning controller 103 of the learning apparatus 10, the storage apparatus 40 stores a neural network 0 that was trained using training data in a time period from Jan. 1, 2010 to Dec. 31, 2012, a neural network 1 that was trained using training data in a time period from Jan. 1, 2010 to Dec. 31, 2014, and a neural network 2 that was trained using training data in a time period from Jan. 1, 2010 to Dec. 31, 2016. The neural network 0 is stored in the storage apparatus 40 in association with a learning module identifier 0. The neural network 1 is stored in the storage apparatus 40 in association with a learning module identifier 1. The neural network 2 is stored in the storage apparatus 40 in association with a learning module identifier 2.

It is assumed that the additional training control apparatus 20 sets the neural network 1 as the learning module 202 until Jan. 31, 2017. The sensing data output by the sensor 30 is stored in the storage apparatus 40 at any time. That is, sensing data relating to an input parameter such as the current room temperature or an ambient temperature, and sensing data relating to evaluation items such as the number of external operations that are performed on a room temperature setting value and the power consumption in a time period during which air-conditioning control is performed using the neural network 1 are stored in the storage apparatus 40. Also, as a result of evaluation that was performed previously by the evaluation unit 204, evaluation data 0 of the neural network 0 and evaluation data 1 of the neural network 1 are stored in the evaluation data DB 2092 of the additional training control apparatus 20.

In step S601, the trained module acceptance unit 201 of the additional training control apparatus 20 accepts the neural network and the learning module identifier, and sets the accepted neural network as the learning module 202. In the present embodiment, on Feb. 1, 2017, the trained module acceptance unit 201 accepts the neural network 2 and the learning module identifier 2 from the trained module output unit 105 of the learning apparatus 10, and sets the accepted neural network 2 as the learning module 202. From Feb. 1, 2017 onward, the controller 208 performs air-conditioning control using a room temperature setting value calculated by the set neural network 2. As described before, the sensing data output by the sensor 30 during the time period in which air-conditioning control is performed using the neural network 2 is stored in the storage apparatus 40 at any time.

In step S602, the learning objective acceptance unit 203 of the additional training control apparatus 20 accepts a learning objective that is to be achieved by the learning module 202 via the input unit 20d, and stores the accepted learning objective in the learning objective DB 2091 in the DB 209. In the present embodiment, on Apr. 1, 2017, the learning objective acceptance unit 203 accepts the learning objective from an administrator of the additional training control apparatus 20 via the input unit 20d, and stores the accepted learning objective in the learning objective DB 2091. It is assumed that the learning objective includes an evaluation item "the number of external operations per day that are performed on the room temperature setting value" and a condition "is smaller than the immediately previously set learning module".

In step S603, the evaluation unit 204 of the additional training control apparatus 20 evaluates the learning module 202 for the evaluation item "the number of external operations per day that are performed on the room temperature setting value" included in the learning objective and produces evaluation data 1, and stores the produced evaluation data 1 in the evaluation data DB 2092 in DB 209 in association with the learning module identifier accepted by the trained module acceptance unit 201. In the present embodiment, in response to the acceptance of the learning objective in step S602, the evaluation unit 204 evaluates the neural network 2 using the sensing data stored in the storage apparatus 40 in a time period during which air-conditioning control is performed using the neural network 2.
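As a rough illustration of how evaluation data for the evaluation item "the number of external operations per day" might be derived from stored sensing data, the following sketch averages timestamped operations over the period in which a given learning module was in use. The function name and the assumption that the sensing data records each external operation as a timestamp are hypothetical; the specification does not define the format of the sensing data.

```python
from datetime import date, datetime
from typing import Iterable


def external_ops_per_day(op_times: Iterable[datetime], start: date, end: date) -> float:
    """Average number of external operations per day in [start, end], inclusive."""
    days = (end - start).days + 1
    if days <= 0:
        raise ValueError("end must not be earlier than start")
    # Count only the operations that fall inside the evaluation period.
    count = sum(1 for t in op_times if start <= t.date() <= end)
    return count / days
```

For the neural network 2 in this embodiment, the period would presumably begin on Feb. 1, 2017, when the neural network 2 was set, and end when the evaluation is performed.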

Here, the evaluation unit 204 may perform evaluation on another preset evaluation item in addition to the evaluation items included in the learning objective accepted in step S602. Such a configuration makes it possible to handle a case where a future learning objective designates, for different evaluation items, conditions of comparison with the immediately previously set learning module. In the present embodiment, the evaluation unit 204 evaluates another preset evaluation item in addition to the evaluation item "the number of external operations per day that are performed on the room temperature setting value" included in the learning objective, produces evaluation data 2, and stores the evaluation data 2 in the evaluation data DB 2092 in association with the learning module identifier 2.

Next, in step S604, the determination unit 205 of the additional training control apparatus 20 determines whether or not the currently set learning module has achieved the learning objective, using the learning objective accepted by the learning objective acceptance unit 203 in step S602 and the evaluation data stored in the evaluation data DB 2092. If it is determined that the learning objective has been achieved (step S604: Yes), the additional training control apparatus 20 ends the processing.

In the present embodiment, the determination unit 205 determines whether or not the neural network 2 achieves the learning objective, using the learning objective including the evaluation item "the number of external operations per day that are performed on the room temperature setting value" and the condition "is smaller than the immediately previously set learning module", the evaluation data 1, and the evaluation data 2. More specifically, the determination unit 205 determines whether the evaluation data 1 including data indicating the evaluation item included in the learning objective meets the corresponding condition, i.e., whether or not "the number of external operations per day that are performed on the room temperature setting value is smaller than the immediately previously set learning module", and whether the evaluation data 2 meets the corresponding condition. As a result, it is determined that at least the evaluation data 1 does not meet the corresponding condition and the learning objective has not been achieved (step S604: No), and the processing advances to step S605.

In step S605, the learning module acquisition unit 206 of the additional training control apparatus 20 transmits the learning module extraction condition to the trained module extraction unit 106 of the learning apparatus 10, sets the learning module obtained as a response thereto as the learning module 202, and ends the processing. In this embodiment, the learning module acquisition unit 206 transmits, to the trained module extraction unit 106, the learning module extraction condition "the learning module identifier of the currently set learning module: learning module identifier 2, the extraction point: version of a learning module that was used prior to the currently set learning module", and sets the neural network 1 obtained as a response thereto as the learning module 202.
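The extraction of the version used prior to the currently set learning module can be pictured with the sketch below. It is a simplified, hypothetical stand-in for the trained module extraction unit 106: the `history` list, the `modules` mapping, and the function name are assumptions, and each learning module identifier is assumed to appear only once in the history.

```python
from typing import Dict, List, Tuple


def extract_previous(history: List[int], modules: Dict[int, str],
                     current_id: int) -> Tuple[int, str]:
    """Return the (identifier, module) that was set just before current_id.

    `history` lists learning module identifiers in the order they were set.
    """
    idx = history.index(current_id)  # raises ValueError if current_id was never set
    if idx == 0:
        raise LookupError("no learning module was set before the current one")
    prev_id = history[idx - 1]
    return prev_id, modules[prev_id]
```

With the history 0, 1, 2 of this embodiment and the currently set learning module identifier 2, the sketch returns the entry for the learning module identifier 1.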

Second embodiment
In the first embodiment, an example was described in which the neural network 2 that was determined to have not achieved the learning objective was reverted to the neural network 1 that is the version before update. In a second embodiment, an example will be described in which if it is determined that the learning objective has not been achieved, a neural network is further additionally trained.

FIG. 7 is a flowchart showing the flow of additional training processing executed by the additional training control apparatus 20. In the second embodiment, description of matters shared with the first embodiment is omitted, and only different points will be described. Steps S701 to S705 in FIG. 7 are the same processes as steps S601 to S605 in FIG. 6, and thus detailed description of these processes will be omitted.

In step S706, the learning module acquisition unit 206 instructs the factor estimation unit 207 of the additional training control apparatus 20 to estimate the factor of failure in the achievement of the learning objective, and the factor estimation unit 207 estimates the factor of failure in the achievement of the learning objective. In the present embodiment, the factor estimation unit 207 estimates the factor using the sensing data of the storage apparatus 40 with regard to the evaluation item "the number of external operations per day that are performed on the room temperature setting value" included in the learning objective. For example, the factor estimation unit 207 estimates the factor using sensing data obtained in a time period from Jan. 1, 2010 to Dec. 31, 2016, which is the time period of training data used for the neural network 2.

In the present embodiment, the factor estimation unit 207 uses a known algorithm to find a conceivable pattern about the date on which the evaluation item "the number of external operations per day that are performed on the room temperature setting value" included in the learning objective is higher than values of the other days. For example, it is assumed that the pattern that "the number of external operations per day that are performed on the room temperature setting value" is large in a specific time period "from Jul. 1, 2016 to Jul. 31, 2016" is found. Note that the pattern does not only depend on a time period, and the pattern that "a value of an input parameter" is large or small when the value included in the input parameters, such as ambient temperature or humidity, continues to differ from a predetermined value may be found. Also, if the volume of a room included in the input parameters changed, the pattern that "the volume of the room" changed may be found.
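The specification leaves the "known algorithm" unspecified. As one conceivable stand-in, the sketch below flags days whose count exceeds the mean by k standard deviations and merges consecutive flagged days into time periods. The function name, the threshold rule, and the default k = 2.0 are assumptions chosen for illustration only.

```python
from datetime import date, timedelta
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def find_outlier_periods(daily: Dict[date, float],
                         k: float = 2.0) -> List[Tuple[date, date]]:
    """Return runs of consecutive days whose value exceeds mean + k * std."""
    values = list(daily.values())
    if len(values) < 2:
        return []
    threshold = mean(values) + k * pstdev(values)
    high = sorted(d for d, v in daily.items() if v > threshold)
    periods: List[Tuple[date, date]] = []
    for d in high:
        if periods and d - periods[-1][1] == timedelta(days=1):
            periods[-1] = (periods[-1][0], d)  # extend the current run
        else:
            periods.append((d, d))             # start a new run
    return periods
```

Given daily counts that are uniformly low except throughout July 2016, such a routine would report the single period from Jul. 1, 2016 to Jul. 31, 2016.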

In step S707, the factor estimation unit 207 determines whether or not an outlier pattern is found. If an outlier pattern is not found (step S707: No), the additional training control apparatus 20 ends the processing. On the other hand, if an outlier pattern is found (step S707: Yes), the processing advances to step S708, and the factor estimation unit 207 determines the found outlier pattern as the estimated factor and may output the found outlier pattern. As shown in FIG. 8, in the present embodiment, a factor confirmation screen is output which includes a text 801 "the number of external operations per day that are performed on the room temperature setting value is large from Jul. 1, 2016 to Jul. 31, 2016.", which shows the found outlier pattern, a performance deterioration factor radio button 802 that sets whether or not the pattern is the performance deterioration factor, an advanced setting button 803, and a determination button 804. The advanced setting button 803 will be described later.

In step S709, the factor estimation unit 207 determines whether or not the user input has been received. If the user input has been received, the processing advances to step S710, the factor estimation unit 207 produces factor improvement data based on the user input, and returns the produced factor improvement data to the learning module acquisition unit 206. The factor improvement data includes, among the outlier patterns output from the factor estimation unit 207 in step S708, any pattern that the user has designated as a performance deterioration factor with the performance deterioration factor radio button 802 and that is therefore not desirable to include in the training data.

In the present embodiment, it is assumed that the administrator of the additional training control apparatus 20 confirms that interior construction was performed in a time period from Jul. 1, 2016 to Jul. 31, 2016, selects the performance deterioration factor radio button 802 as the performance deterioration factor on the screen output in step S708, and presses the determination button 804. As a result, the factor estimation unit 207 produces factor improvement data "exclude a time period from Jul. 1, 2016 to Jul. 31, 2016" based on the received user input, and returns the produced factor improvement data to the learning module acquisition unit 206.

In step S711, the learning module acquisition unit 206 produces a learning instruction based on the factor improvement data, transmits the produced learning instruction to the learning instruction acceptance unit 101 of the learning apparatus 10, and ends the processing. In the present embodiment, the learning module acquisition unit 206 transmits a learning instruction that includes "training data acquisition condition: a time period from Jan. 1, 2010 to Dec. 31, 2016, excluding a time period from Jul. 1, 2016 to Jul. 31, 2016" to the learning instruction acceptance unit 101.
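One way to picture a training data acquisition condition is sketched below, assuming the condition is meant to select the overall time period from Jan. 1, 2010 to Dec. 31, 2016 while leaving out the time period from Jul. 1, 2016 to Jul. 31, 2016, which matches the further additional training described for this embodiment. The `AcquisitionCondition` class and `select_training_days` function are hypothetical names; the actual format of the learning instruction is not defined in the specification.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Iterable, List, Tuple


@dataclass
class AcquisitionCondition:
    """Hypothetical form of a training data acquisition condition."""
    start: date
    end: date
    excluded: List[Tuple[date, date]] = field(default_factory=list)

    def accepts(self, day: date) -> bool:
        # The day must lie in the overall period and in no excluded period.
        if not (self.start <= day <= self.end):
            return False
        return not any(lo <= day <= hi for lo, hi in self.excluded)


def select_training_days(days: Iterable[date],
                         cond: AcquisitionCondition) -> List[date]:
    """Keep only the days whose training data may be used."""
    return [d for d in days if cond.accepts(d)]
```

Representing the exclusions as a list of periods would also allow several performance deterioration factors to be excluded in one learning instruction.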

Note that in the present embodiment, if it is determined in step S707 that an outlier pattern is found, the factor improvement data is produced based on the user input in step S710, but a configuration may be adopted in which steps S708 to S710 are omitted, the user input is not accepted, and the factor estimation unit 207 produces factor improvement data based on the found outlier pattern.

By doing this, the learning apparatus 10 can further additionally train the neural network based on the received learning instruction. In the present embodiment, the learning apparatus 10 can further additionally train the neural network 1 using the training data from which a time period from Jul. 1, 2016 to Jul. 31, 2016 is excluded.

Third embodiment
In the second embodiment, an example was described in which if it is determined that the learning objective has not been achieved, a neural network is further additionally trained, excluding data that is not desirable to be included in the training data. In a third embodiment, an example will be described in which if it is determined that the learning objective has not been achieved, data that is used to further train the neural network is added, and the neural network is further additionally trained.

The third embodiment will be described with reference to FIG. 7. In the third embodiment as well, description of matters shared with the first embodiment and the second embodiment is omitted, and only different points will be described. Steps S701 to S705 in FIG. 7 are the same processes as those in the first embodiment, and thus detailed description of these processes will be omitted.

In step S706, the factor estimation unit 207 estimates the factor of failure in the achievement of the learning objective. In the present embodiment, it is assumed that the pattern that "the number of external operations that are performed on the room temperature setting value" is large in a specific time slot "from 8:00 to 9:00 on every Saturday" is found. Note that the pattern does not only depend on a time period, and the pattern that "the value of an input parameter" is large or small when the value included in the input parameter, such as ambient temperature or humidity, continues to differ from a predetermined value may be found. Also, if the volume of a room included in the input parameters changed, the pattern that "the volume of the room" changed may be found.
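A recurring time-slot pattern such as "from 8:00 to 9:00 on every Saturday" could, for instance, be found by counting operations per weekday and hour. The sketch below is an illustrative assumption rather than the method of the factor estimation unit 207: the function name, the bucketing by weekday and hour, and the default threshold k = 3.0 are all hypothetical.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev
from typing import Iterable, List, Tuple


def busy_time_slots(op_times: Iterable[datetime],
                    k: float = 3.0) -> List[Tuple[int, int]]:
    """Return (weekday, hour) slots whose operation count exceeds mean + k * std.

    weekday follows datetime.weekday(): Monday is 0 and Saturday is 5.
    """
    counts = Counter((t.weekday(), t.hour) for t in op_times)
    # Include empty slots so that the baseline covers all 7 * 24 slots.
    all_counts = [counts.get((wd, h), 0) for wd in range(7) for h in range(24)]
    threshold = mean(all_counts) + k * pstdev(all_counts)
    return sorted(slot for slot, c in counts.items() if c > threshold)
```

A slot reported as (5, 8) would correspond to the time slot from 8:00 to 9:00 on Saturdays.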

In step S707, the factor estimation unit 207 determines whether or not an outlier pattern is found. In the present embodiment, an outlier pattern is found, and thus the processing advances to step S708, and the factor estimation unit 207 may output the found outlier pattern to the display unit 10e. In the present embodiment, a factor confirmation screen is output to the display unit 10e, the screen including a text 801 "the number of external operations that are performed on the room temperature setting value is large from 8:00 to 9:00 on every Saturday." that indicates the found outlier pattern, a performance deterioration factor radio button 802, an advanced setting button 803, and a determination button 804.

In the present embodiment, it is assumed that because a smaller number of users utilize the room on every Saturday than on weekdays, the administrator of the additional training control apparatus 20 confirms that it is desirable to set the room temperature setting value to a higher value compared to that on weekdays, selects the performance deterioration factor radio button 802 as a non-performance deterioration factor, and presses the advanced setting button 803 on the screen output in step S708. For example, when the advanced setting button 803 is pressed, the advanced setting screen is displayed on the display unit 10e, and the administrator may adjust the input parameters on this advanced setting screen. Alternatively, the additional training control apparatus 20 may adjust the input parameters.

In step S709, the factor estimation unit 207 determines whether or not user input has been received. In the present embodiment, it is assumed that the administrator adjusts the input parameters on the advanced setting screen, and thereafter presses the determination button 804. As a result, the processing advances to step S710, the factor estimation unit 207 produces factor improvement data based on the user input, and returns the produced factor improvement data to the learning module acquisition unit 206. In the present embodiment, the factor improvement data includes settings relating to the input parameters designated on the advanced setting screen.

In step S711, the learning module acquisition unit 206 produces a learning instruction based on the factor improvement data, transmits the produced learning instruction to the learning instruction acceptance unit 101 of the learning apparatus 10, and ends the processing. In the present embodiment, the learning module acquisition unit 206 transmits a learning instruction that includes the settings relating to the input parameters designated on the advanced setting screen to the learning instruction acceptance unit 101.

By doing this, the learning apparatus 10 can additionally train the neural network based on the received learning instruction. In the present embodiment, if the pattern found by the additional training control apparatus 20 is not a performance deterioration factor, and is used to further train the neural network, the learning apparatus 10 can further additionally train the neural network 2 using additional training data.

Another embodiment
As described above, in another embodiment, as shown in FIG. 9, the learning system 1 may include a control apparatus 50 provided with a trained module acceptance unit 201, a learning module 202, and a controller 208. In the present embodiment in which the learning system 1 includes the control apparatus 50, the learning system 1 includes an evaluation apparatus 60 provided with a learning objective acceptance unit 203, an evaluation unit 204, a determination unit 205, a learning module acquisition unit 206, a factor estimation unit 207, and a DB 209. As described above, the additional training control apparatus 20 includes the evaluation apparatus 60.

As shown in FIG. 10, the evaluation apparatus 60 includes a controller 60a including a CPU and a RAM, a storage unit 60b that stores data and the like in DB 209, a communication unit 60c for connection to a network N, an input unit 60d that accepts an input from a user, a display unit 60e, and the like. These constituent elements are connected to each other via a bus so as to exchange data with each other. The controller 60a functions as the units shown in FIG. 9 by the CPU loading a program stored in the storage unit to the RAM, and interpreting and executing this program loaded to the RAM.

The evaluation apparatus 60 may be configured by the CPU of a general-purpose personal computer executing an additional training control program, for example. The additional training control program may be provided by being stored in a computer-readable storage medium such as the RAM or the storage unit 60b, or may be provided via the communication network N connected by the communication unit 60c.

A program that implements the processes described in this specification may be stored in a recording medium. Use of this recording medium makes it possible to cause a computer to function as the evaluation apparatus 60 or the additional training control apparatus 20 by installing the above-described program in this computer. Here, the recording medium in which the above-described program is stored may be a non-transitory recording medium. Although there are no particular limitations on the non-transitory recording medium, a recording medium such as a CD-ROM may be used, for example.

Part or all of the above-described embodiments will be described as the following additional remarks, but are not limited thereto.

Additional Remark 1
An evaluation apparatus that includes at least one memory and at least one hardware processor that is connected to the memory, and evaluates a second learning module obtained by additionally training a first learning module,
wherein the hardware processor
accepts a learning objective that is to be achieved by the second learning module,
evaluates the second learning module with respect to at least an evaluation item included in the learning objective, and produces evaluation data,
determines whether or not the second learning module has achieved the learning objective using the learning objective and the evaluation data, and
acquires, if it is determined that the learning objective is not achieved, a third learning module that is different from the second learning module, based on at least the learning objective.

Additional Remark 2
An evaluation method for evaluating a second learning module obtained by additionally training a first learning module, the evaluation method including:
accepting, by at least one hardware processor, a learning objective that is to be achieved by the second learning module;
evaluating, by the hardware processor, the second learning module with respect to at least an evaluation item included in the learning objective, and producing evaluation data;
determining, by the hardware processor, whether or not the second learning module has achieved the learning objective using the learning objective and the evaluation data; and
acquiring, if it is determined that the learning objective is not achieved, a third learning module that is different from the second learning module, based on at least the learning objective.

Claims

1. An evaluation apparatus configured to evaluate a second learning module obtained by additionally training a first learning module, the evaluation apparatus comprising:
a learning objective acceptance unit configured to accept a learning objective that is to be achieved by the second learning module;
an evaluation unit configured to evaluate the second learning module with respect to at least an evaluation item included in the learning objective, and produce evaluation data;
a determination unit configured to determine whether or not the second learning module has achieved the learning objective using the learning objective and the evaluation data; and
a learning module acquisition unit configured to acquire, if it is determined that the learning objective is not achieved, a third learning module that is different from the second learning module, based on at least the learning objective.
2. The evaluation apparatus according to claim 1,
wherein the third learning module is the first learning module.
3. The evaluation apparatus according to claim 1 or 2, further comprising:
a factor estimation unit configured to estimate a factor that affects failure in achievement of the learning objective.
4. The evaluation apparatus according to claim 3,
wherein the factor estimation unit is configured to produce factor improvement data relating to training data used to additionally train a learning module, based on the estimated factor.
5. The evaluation apparatus according to claim 4,
wherein the learning module acquisition unit produces a learning instruction based on the factor improvement data.
6. The evaluation apparatus according to claim 4 or 5,
wherein the factor estimation unit outputs the estimated factor to a display unit, receives a user input relating to the estimated factor, and produces the factor improvement data based on the user input.
7. The evaluation apparatus according to any one of claims 1 to 6,
wherein the learning module is configured to perform control of a system.
8. The evaluation apparatus according to claim 7,
wherein the evaluation unit is configured to obtain sensing data output by a sensor during a time period in which the control of the system is performed by the learning module and produce the evaluation data using the sensing data.
9. An evaluation method for evaluating a second learning module obtained by additionally training a first learning module, the evaluation method comprising:
accepting, by a computer including a controller, a learning objective that is to be achieved by the second learning module;
evaluating, by the computer, the second learning module with respect to at least an evaluation item included in the learning objective, and producing evaluation data;
determining, by the computer, whether or not the second learning module has achieved the learning objective using the learning objective and the evaluation data; and
acquiring, by the computer, if it is determined that the learning objective is not achieved, a third learning module that is different from the second learning module, based on at least the learning objective.
10. A program for causing a computer to evaluate a second learning module obtained by additionally training a first learning module, the program causing the computer to execute:
accepting a learning objective that is to be achieved by the second learning module;
evaluating the second learning module with respect to at least an evaluation item included in the learning objective, and producing evaluation data;
determining whether or not the second learning module has achieved the learning objective using the learning objective and the evaluation data; and
acquiring, if it is determined that the learning objective is not achieved, a third learning module that is different from the second learning module.