WO2021192280A1 - Learning device and inference device for air-conditioning control - Google Patents

Learning device and inference device for air-conditioning control

Info

Publication number
WO2021192280A1
Authority
WO
WIPO (PCT)
Prior art keywords
parameter
air conditioning
equipment
learning
conditioning system
Prior art date
Application number
PCT/JP2020/014248
Other languages
French (fr)
Japanese (ja)
Inventor
貴則 京屋
Original Assignee
三菱電機株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 三菱電機株式会社
Priority to CN202080095692.6A (CN115280077B)
Priority to JP2022510379A (JP7414964B2)
Priority to PCT/JP2020/014248 (WO2021192280A1)
Publication of WO2021192280A1

Classifications

    • F: MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F24: HEATING; RANGES; VENTILATING
    • F24F: AIR-CONDITIONING; AIR-HUMIDIFICATION; VENTILATION; USE OF AIR CURRENTS FOR SCREENING
    • F24F11/00: Control or safety arrangements
    • F24F11/62: Control or safety arrangements characterised by the type of control or by internal processing, e.g. using fuzzy logic, adaptive control or estimation of values

Definitions

  • This disclosure relates to an air conditioning control learning device and an inference device.
  • Patent Document 1: Japanese Patent Application Laid-Open No. 5-256493
  • PMV: Predicted Mean Vote
  • In general, air conditioning in a factory is often held constant or controlled manually according to the workers' own sense of comfort. Further, the thermal environment index PMV (Predicted Mean Vote) value of Patent Document 1 expresses comfort as a general solution, and its correlation with worker productivity is unknown. Patent Document 1 does not consider improving the productivity of factory workers.
  • This disclosure was made to solve the above-mentioned problems, and the purpose is to improve the productivity of factory workers.
  • the learning device learns the control of the air conditioning system of a factory including at least one facility.
  • the learning device includes a first data acquisition unit and a model generation unit.
  • the first data acquisition unit acquires learning data including a first parameter representing the state of at least one facility and an air conditioning system and a second parameter relating to the intensity of air conditioning of the air conditioning system.
  • the model generation unit generates a trained model that infers the second parameter from the first parameter using the training data.
  • The first parameter includes the identification information of the worker performing work at each of the at least one piece of equipment, the item of the product produced by the at least one piece of equipment, the identification information of the at least one piece of equipment, the takt time of the product, information on the quality of the product, and information on the time at which the first parameter was acquired.
  • the inference device outputs the control of the air conditioning system of the factory including at least one facility.
  • the inference device includes a data acquisition unit and an inference unit.
  • the data acquisition unit acquires a first parameter representing the state of at least one facility and the air conditioning system.
  • the inference unit outputs the second parameter from the first parameter acquired by the data acquisition unit using a learned model that infers the second parameter related to the air conditioning intensity of the air conditioning system from the first parameter.
  • The first parameter includes the identification information of the worker performing work at each of the at least one piece of equipment, the item of the product produced by the at least one piece of equipment, the identification information of the at least one piece of equipment, the takt time of the product, information on the quality of the product, and information on the time at which the first parameter was acquired.
  • According to the learning device and the inference device of the present disclosure, because the first parameter includes the identification information of the worker performing work at each of the at least one piece of equipment, the item of the product produced by the at least one piece of equipment, the identification information of the at least one piece of equipment, the takt time of the product, information on the quality of the product, and information on the time at which the first parameter was acquired, the productivity of the workers in the factory can be improved.
  • FIG. 1 is a block diagram showing an example of the configuration of a management server 10 including a learning device 100 and an inference device 200 according to an embodiment, and an air conditioning system 20 and a factory 30 controlled by the management server 10.
  • FIG. 2 is a diagram showing an example of working hours, equipment identification information, items, worker identification information, expected optimum temperature, and air conditioning intensity.
  • The factory 30 includes equipment Eq1, Eq2, and Eq3.
  • The workpiece Wrk passes through the work processes at equipment Eq1 to Eq3 in that order and is shipped as the product Prd.
  • At equipment Eq1 to Eq3, the workers Op1, Op2, and Op3 perform the work.
  • the management server 10 includes an information processing system 11 and a data collection / processing system 12.
  • the information processing system 11 includes a learning device 100 and an inference device 200.
  • the management server 10 acquires the temperature and humidity of the equipment Eq1, the temperature and humidity of the equipment Eq2, and the temperature and humidity of the equipment Eq3 from the temperature and humidity sensors Sn1, Sn2, and Sn3 by wireless communication, respectively.
  • the management server 10 acquires the temperature and humidity of the outdoor unit 21 from the temperature / humidity sensor Sn10 via the air conditioning controller 23 by wired communication.
  • the management server 10 acquires the temperature and humidity of the indoor unit 22 from the temperature / humidity sensor Sn11 via the data collection / processing system 12 by wired communication.
  • the management server 10 acquires the air conditioning control estimation parameter Prm1 (first parameter) at the production site.
  • The air conditioning control estimation parameter Prm1 includes the identification information of the workers Op1 to Op3 who perform the work at each of the equipment Eq1 to Eq3, the item of the product Prd produced by the equipment Eq1 to Eq3, the identification information of the equipment Eq1 to Eq3, the takt time of the product Prd, information on the quality of the product Prd, and information on the time at which the air conditioning control estimation parameter Prm1 was acquired.
  • the information on the quality of the product Prd includes, for example, the result of the quality inspection performed in the inspection process or the information on the yield.
  • the air-conditioning control estimation parameter Prm1 may include images of the workers Op1 to Op3 during their respective operations.
  • The air conditioning system 20 includes an outdoor unit 21, an indoor unit 22, and an air conditioning controller 23.
  • the outdoor unit 21 is arranged outside the factory 30.
  • the indoor unit 22 and the air conditioning controller 23 are arranged in the factory 30.
  • the outdoor unit 21 includes a fan, a compressor, and a heat exchanger.
  • the indoor unit 22 includes a fan, a heat exchanger and an expansion valve.
  • the air conditioning controller 23 includes a thermostat.
  • the air conditioning controller 23 receives the air conditioning intensity control parameter Prm2 (second parameter) from the management server 10 and controls the outdoor unit 21 and the indoor unit 22.
  • The air conditioning intensity control parameter Prm2 includes ON/OFF of the thermostat, the rotation frequency of the compressor, the airflow strength of the fan, the evaporation temperature of the refrigerant, and the condensation temperature of the refrigerant.
  • FIG. 3 is a block diagram showing the configuration of the learning device 100 of FIG. 1.
  • the learning device 100 includes a data acquisition unit 110 (first data acquisition unit) and a model generation unit 120.
  • The data acquisition unit 110 acquires the air conditioning control estimation parameter Prm1 and the air conditioning intensity control parameter Prm2 as learning data.
  • the model generation unit 120 learns the air conditioning intensity control by using the learning data including the air conditioning control estimation parameter Prm1 and the air conditioning intensity control parameter Prm2. That is, the model generation unit 120 generates a learned model that infers the air conditioning intensity control parameter Prm2 from the air conditioning control estimation parameter Prm1.
  • As the learning algorithm used by the model generation unit 120, known algorithms such as supervised learning, unsupervised learning, and reinforcement learning can be used. In the following, the case where reinforcement learning is applied is described as an example.
  • In reinforcement learning, an agent (the acting entity) in a certain environment observes the current state (environmental parameters) and decides the action to take. The environment changes dynamically depending on the agent's actions, and the agent is rewarded according to the change in the environment.
  • The agent repeats this process and learns the action policy that yields the most reward through a series of actions.
  • Q-learning and TD-learning are known as typical methods of reinforcement learning.
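  • In the case of Q-learning, for example, the general update formula for the action value function $Q(s_t, a_t)$ is expressed by the following equation (1):
    $$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left( r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right) \tag{1}$$
  • In equation (1), $s_t$ represents the state of the environment at time $t$, and $a_t$ represents the action at time $t$.
  • The action $a_t$ changes the state from $s_t$ to $s_{t+1}$.
  • $r_{t+1}$ represents the reward obtained by that change of state.
  • $\gamma$ represents the discount rate and is in the range $0 < \gamma \le 1$.
  • $\alpha$ represents the learning coefficient and is in the range $0 < \alpha \le 1$.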
  • The air conditioning intensity control parameter Prm2 serves as the action $a_t$, and the air conditioning control estimation parameter Prm1 of the production site serves as the state $s_t$.
  • By repeatedly updating the action value function $Q(s, a)$ according to equation (1), the agent learns the best action $a_t$ in the state $s_t$ at time $t$.
  • The update formula of equation (1) increases the action value Q when the action value Q of the best action a at time t + 1 is larger than the action value Q of the action a executed at time t.
  • In the opposite case, the update formula reduces the action value Q.
  • the action value function Q (s, a) is updated so that the action value Q of the action a at time t approaches the best action value at time t + 1.
  • the best behavioral value in a certain environment is sequentially propagated to the behavioral value in the previous environment.
  • the model generation unit 120 includes a reward calculation unit 121 and a function update unit 122.
  • the reward calculation unit 121 calculates the reward using the air conditioning control estimation parameter Prm1 and the air conditioning intensity control parameter Prm2.
  • The reward calculation unit 121 calculates the reward r according to the increase or decrease in productivity, where productivity is the number of products Prd actually produced in the factory 30 per unit time (for example, pieces/hour).
  • For example, the reward calculation unit 121 calculates the reward r according to the degree of deviation between the productivity of the factory 30 and the total of the individual standard productivities of the workers Op1 to Op3, or according to the degree of deviation between the productivity of the factory 30 and the standard productivity corresponding to the standard takt time.
  • If the productivity of the factory 30 is higher than the previous time, the reward r is increased (for example, a reward of "1" is given), while if the productivity of the factory 30 is lower than the previous time, the reward r is reduced (for example, a reward of "-1" is given).
  • the function update unit 122 updates the function for determining the air conditioning intensity control parameter Prm2 according to the reward calculated by the reward calculation unit 121, and outputs the function to the trained model storage unit 140.
  • In the case of Q-learning, the action value function $Q(s_t, a_t)$ represented by equation (1) is used as the function for calculating the air conditioning intensity control parameter Prm2.
  • The learning device 100 repeatedly executes the above learning.
  • The learned model storage unit 140 stores the action value function $Q(s_t, a_t)$ updated by the function update unit 122, that is, the learned model.
  • FIG. 4 is a flowchart showing the learning process of the learning device 100 of FIG. 3. In the following, a step is simply referred to as S.
  • The data acquisition unit 110 acquires the air conditioning control estimation parameter Prm1 and the air conditioning intensity control parameter Prm2 as learning data. Specifically, the data acquisition unit 110 attaches to the identification information of each of the workers Op1 to Op3 the identification information of the equipment at which the worker is working, the reference takt time for the worker, and the working time, and attaches to each temperature and humidity reading the position and the time at which it was measured.
  • The model generation unit 120 calculates the reward using the air conditioning control estimation parameter Prm1 and the air conditioning intensity control parameter Prm2. Specifically, the reward calculation unit 121 acquires Prm1 and Prm2 and, based on the degree of deviation between the standard productivity, which is a predetermined reward criterion, and the actual productivity of the factory 30, determines whether to increase the reward corresponding to the air conditioning intensity control parameter Prm2 (S103) or to decrease it (S104). When the actual productivity of the factory 30 is larger than the standard productivity, the reward calculation unit 121 increases the reward in S103. When the actual productivity of the factory 30 is smaller than the standard productivity, the reward calculation unit 121 reduces the reward in S104.
  • The predetermined reward criterion is not limited to the standard productivity. For example, a criterion may be used in which the reward is increased when the yield of the product Prd is larger than a standard yield and decreased when the yield is smaller than the standard yield. As a result, the quality of the product Prd can be improved.
  • The function update unit 122 updates the action value function $Q(s_t, a_t)$ stored in the learned model storage unit 140, using the reward calculated by the reward calculation unit 121 and equation (1).
  • The learning device 100 repeatedly executes the above steps S101 to S105 and stores the generated action value function $Q(s_t, a_t)$ as the learned model.
  • The case where the learned model is stored in the learned model storage unit 140 provided outside the learning device 100 has been described, but the learned model storage unit 140 may instead be provided inside the learning device 100.
  • FIG. 5 is a block diagram showing the configuration of the inference device 200 of FIG. 1.
  • the inference device 200 includes a data acquisition unit 210 and an inference unit 220.
  • the data acquisition unit 210 acquires the air conditioning control estimation parameter Prm1.
  • the inference unit 220 infers the air conditioning intensity control parameter Prm2 by using the learned model stored in the learned model storage unit 140. That is, by inputting the air conditioning control estimation parameter Prm1 of the production site acquired by the data acquisition unit 210 into the trained model, it is possible to infer the air conditioning intensity control parameter Prm2 suitable for the air conditioning control estimation parameter Prm1 of the production site.
  • The configuration in which the air conditioning intensity control parameter Prm2 is inferred using the learned model trained by the model generation unit 120 of FIG. 3 has been described, but the air conditioning intensity control parameter may instead be output using a learned model trained in another environment.
  • FIG. 6 is a flowchart showing the inference process of the inference device 200 of FIG. 5.
  • the data acquisition unit 210 acquires the air conditioning control estimation parameter Prm1 at the production site.
  • The inference unit 220 inputs the air conditioning control estimation parameter Prm1 of the production site into the learned model stored in the learned model storage unit 140 to obtain the air conditioning intensity control parameter Prm2, and in S203 outputs the air conditioning intensity control parameter Prm2 to the air conditioning system 20.
  • the air conditioning system 20 uses the air conditioning intensity control parameter Prm2 output from the inference device 200 to perform air conditioning control having an intensity that increases the amount of productivity change predicted in the near future.
  • The learning algorithm is not limited to reinforcement learning.
  • In addition to reinforcement learning, supervised learning, unsupervised learning, semi-supervised learning, and the like can also be applied as the learning algorithm.
  • As the learning algorithm used in the model generation unit 120, deep learning, which learns the extraction of the features themselves, can also be used. Machine learning may also be performed according to other known methods such as neural networks, genetic programming, functional logic programming, or support vector machines.
  • the learning device 100 and the inference device 200 may be devices separate from the air conditioning system 20 that are connected to the air conditioning system 20 via a network, for example. Further, the learning device 100 and the inference device 200 may be built in the air conditioning system 20. Further, the learning device 100 and the inference device 200 may exist on the cloud server.
  • Worker data may be simplified by setting worker personas from multiple viewpoints such as age, skill level, and gender (for example, a newly hired man in his 20s) and setting a worker model for each persona.
  • the data configuration of the air conditioning control estimation parameter Prm1 may be simplified by preparing a plurality of models of factories, equipment, and lines in advance.
  • the model generation unit 120 may learn the air conditioning intensity control by using the learning data acquired from the plurality of air conditioning systems 20.
  • The model generation unit 120 may acquire learning data from a plurality of air conditioning systems 20 used in the same area, or may learn the air conditioning intensity control by using learning data collected from a plurality of air conditioning systems 20 operating independently in different areas.
  • An air conditioning system 20 from which learning data is collected can be added to the learning targets or removed from them partway through.
  • The learning device 100 that has learned the air conditioning intensity control for one air conditioning system 20 may be applied to another air conditioning system 20, and the air conditioning intensity control may be relearned and updated for that other air conditioning system.
  • FIG. 7 is a block diagram showing the hardware configuration of the information processing system 11 of FIG. 1.
  • the information processing system 11 includes a processing circuit 51, a memory 52 (storage unit), and an input / output unit 53.
  • the processing circuit 51 includes a CPU (Central Processing Unit) that executes a program stored in the memory 52.
  • the processing circuit 51 may include a GPU (Graphics Processing Unit).
  • the function of the information processing system 11 is realized by software, firmware, or a combination of software and firmware.
  • the software or firmware is described as a program and stored in the memory 52.
  • the processing circuit 51 reads and executes the program stored in the memory 52.
  • the CPU is also called a central processing unit, a processing unit, an arithmetic unit, a microprocessor, a microcomputer, a processor, or a DSP (Digital Signal Processor).
  • DSP: Digital Signal Processor
  • The memory 52 includes a non-volatile or volatile semiconductor memory (for example, RAM (Random Access Memory), ROM (Read Only Memory), flash memory, EPROM (Erasable Programmable Read Only Memory), or EEPROM (Electrically Erasable Programmable Read Only Memory)), as well as magnetic disks, flexible disks, optical discs, compact discs, mini discs, or DVDs (Digital Versatile Discs).
  • the memory 52 stores, for example, a trained model, an air conditioning program, and a machine learning program.
  • the input / output unit 53 receives an operation from the user and outputs the processing result to the user.
  • the input / output unit 53 includes, for example, a mouse, a keyboard, a touch panel, a display, and a speaker.
  • the productivity of factory workers can be improved.
  • Reference signs: 10 management server, 11 information processing system, 12 data collection/processing system, 20 air conditioning system, 21 outdoor unit, 22 indoor unit, 23 air conditioning controller, 30 factory, 51 processing circuit, 52 memory, 53 input/output unit, 100 learning device, 110, 210 data acquisition unit, 120 model generation unit, 121 reward calculation unit, 122 function update unit, 140 learned model storage unit, 200 inference device, 220 inference unit, Eq1 to Eq3 equipment, Op1 to Op3 workers, Prd product, Sn1 to Sn3, Sn10, Sn11 temperature and humidity sensors, Wrk workpiece.
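
The reinforcement-learning procedure described in the definitions above (reward calculation from productivity, update of the action value function by equation (1), and selection of the best action at inference time) can be illustrated with a minimal tabular sketch. This is not the applicant's implementation: the names `reward_from_productivity`, `QTable`, `update`, and `best_action`, the zero reward on a tie, and the default values of the learning coefficient and discount rate are all assumptions made for illustration. States stand in for Prm1 and actions for Prm2.

```python
from collections import defaultdict
from typing import Hashable, Iterable

State = Hashable   # stands in for a discretised Prm1 (assumption)
Action = Hashable  # stands in for a discretised Prm2 (assumption)


def reward_from_productivity(actual: float, standard: float) -> int:
    """Return +1 above the standard productivity and -1 below it.

    Returning 0 on an exact tie is an assumption; the text only
    describes the increase and decrease cases.
    """
    if actual > standard:
        return 1
    if actual < standard:
        return -1
    return 0


class QTable:
    """Tabular action value function Q(s, a) updated by Q-learning."""

    def __init__(self, actions: Iterable[Action],
                 alpha: float = 0.1, gamma: float = 0.9) -> None:
        if not (0 < alpha <= 1 and 0 < gamma <= 1):
            raise ValueError("alpha and gamma must lie in (0, 1]")
        self.actions = tuple(actions)
        self.alpha = alpha          # learning coefficient
        self.gamma = gamma          # discount rate
        self.q = defaultdict(float)  # unseen (state, action) pairs start at 0

    def update(self, s: State, a: Action, r: float, s_next: State) -> None:
        """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_b Q(s',b) - Q(s,a))."""
        best_next = max(self.q[(s_next, b)] for b in self.actions)
        td_error = r + self.gamma * best_next - self.q[(s, a)]
        self.q[(s, a)] += self.alpha * td_error

    def best_action(self, s: State) -> Action:
        """Greedy inference: the action with the highest stored value."""
        return max(self.actions, key=lambda a: self.q[(s, a)])


# Illustrative use with made-up states, actions, and productivity figures.
table = QTable(actions=["weak", "strong"], alpha=0.5, gamma=0.9)
r = reward_from_productivity(actual=12.0, standard=10.0)
table.update(s="morning", a="strong", r=r, s_next="noon")
chosen = table.best_action("morning")
```

A table keyed by (state, action) is only practical when Prm1 and Prm2 are discretised into a small number of values; the text also allows other learners such as neural networks, which would replace the table with a function approximator.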

Abstract

This learning device (100) learns to control the air conditioning system of a factory that includes at least one piece of equipment. The learning device (100) comprises a first data acquisition unit (110) and a model generation unit (120). The first data acquisition unit (110) acquires learning data including a first parameter (Prm1) representing the state of the at least one piece of equipment and the air conditioning system and a second parameter (Prm2) relating to the intensity of air conditioning of the air conditioning system. The model generation unit (120) generates a trained model to infer the second parameter (Prm2) from the first parameter (Prm1) using the learning data. The first parameter (Prm1) includes the identification information of each operator performing work with the at least one piece of equipment, the items of products produced by the at least one piece of equipment, the identification information of the at least one piece of equipment, product takt time, information relating to product quality, and information relating to the time at which the first parameter was obtained.

Description

Learning device and inference device for air-conditioning control
 This disclosure relates to a learning device and an inference device for air conditioning control.
 Air conditioning control that improves the comfort of the air-conditioned space is conventionally known. For example, Japanese Patent Application Laid-Open No. 5-256493 (Patent Document 1) discloses a configuration that keeps an air-conditioning zone stably comfortable by using the thermal environment index PMV (Predicted Mean Vote) value in air conditioning control based on a wind speed sensor and the like.
Patent Document 1: Japanese Patent Application Laid-Open No. 5-256493
 In general, air conditioning in a factory is often held constant or controlled manually according to the workers' own sense of comfort. Further, the thermal environment index PMV (Predicted Mean Vote) value of Patent Document 1 expresses comfort as a general solution, and its correlation with worker productivity is unknown. Patent Document 1 does not consider improving the productivity of factory workers.
 This disclosure was made to solve the above-mentioned problems, and its purpose is to improve the productivity of factory workers.
 The learning device according to one aspect of the present disclosure learns the control of the air conditioning system of a factory including at least one piece of equipment. The learning device includes a first data acquisition unit and a model generation unit. The first data acquisition unit acquires learning data including a first parameter representing the state of the at least one piece of equipment and the air conditioning system, and a second parameter relating to the intensity of air conditioning of the air conditioning system. The model generation unit uses the learning data to generate a trained model that infers the second parameter from the first parameter. The first parameter includes the identification information of the worker performing work at each of the at least one piece of equipment, the item of the product produced by the at least one piece of equipment, the identification information of the at least one piece of equipment, the takt time of the product, information on the quality of the product, and information on the time at which the first parameter was acquired.
 The inference device according to another aspect of the present disclosure outputs the control of the air conditioning system of a factory including at least one piece of equipment. The inference device includes a data acquisition unit and an inference unit. The data acquisition unit acquires a first parameter representing the state of the at least one piece of equipment and the air conditioning system. The inference unit outputs a second parameter from the first parameter acquired by the data acquisition unit, using a trained model that infers the second parameter, which relates to the intensity of air conditioning of the air conditioning system, from the first parameter. The first parameter includes the identification information of the worker performing work at each of the at least one piece of equipment, the item of the product produced by the at least one piece of equipment, the identification information of the at least one piece of equipment, the takt time of the product, information on the quality of the product, and information on the time at which the first parameter was acquired.
 According to the learning device and the inference device of the present disclosure, because the first parameter includes the identification information of the worker performing work at each of the at least one piece of equipment, the item of the product produced by the at least one piece of equipment, the identification information of the at least one piece of equipment, the takt time of the product, information on the quality of the product, and information on the time at which the first parameter was acquired, the productivity of the workers in the factory can be improved.
FIG. 1 is a block diagram showing an example of the configuration of a management server provided with a learning device and an inference device according to the embodiment, and of an air conditioning system and a factory controlled by the management server.
FIG. 2 is a diagram showing an example of working time, equipment identification information, items, worker identification information, expected optimum temperature, and air conditioning intensity.
FIG. 3 is a block diagram showing the configuration of the learning device of FIG. 1.
FIG. 4 is a flowchart showing the learning process of the learning device of FIG. 3.
FIG. 5 is a block diagram showing the configuration of the inference device of FIG. 1.
FIG. 6 is a flowchart showing the inference process of the inference device of FIG. 5.
FIG. 7 is a block diagram showing the hardware configuration of the information processing system of FIG. 1.
 Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. In principle, the same or corresponding parts in the drawings are given the same reference signs and their description is not repeated.
 FIG. 1 is a block diagram showing an example of the configuration of a management server 10 including a learning device 100 and an inference device 200 according to the embodiment, and of an air conditioning system 20 and a factory 30 controlled by the management server 10. FIG. 2 is a diagram showing an example of working time, equipment identification information, items, worker identification information, expected optimum temperature, and air conditioning intensity. Referring to FIGS. 1 and 2, the factory 30 includes equipment Eq1, Eq2, and Eq3. The workpiece Wrk passes through the work processes at equipment Eq1 to Eq3 in that order and is shipped as the product Prd. At equipment Eq1 to Eq3, the workers Op1, Op2, and Op3 perform the work.
 The management server 10 includes an information processing system 11 and a data collection/processing system 12. The information processing system 11 includes a learning device 100 and an inference device 200. The management server 10 acquires the temperature and humidity at equipment Eq1, Eq2, and Eq3 from the temperature and humidity sensors Sn1, Sn2, and Sn3, respectively, by wireless communication. The management server 10 acquires the temperature and humidity of the outdoor unit 21 from the temperature and humidity sensor Sn10 via the air conditioning controller 23 by wired communication. The management server 10 acquires the temperature and humidity of the indoor unit 22 from the temperature and humidity sensor Sn11 via the data collection/processing system 12 by wired communication. The management server 10 acquires the air conditioning control estimation parameter Prm1 (first parameter) of the production site. The air conditioning control estimation parameter Prm1 includes the identification information of the workers Op1 to Op3 who perform the work at each of the equipment Eq1 to Eq3, the item of the product Prd produced by the equipment Eq1 to Eq3, the identification information of the equipment Eq1 to Eq3, the takt time of the product Prd, information on the quality of the product Prd, and information on the time at which the air conditioning control estimation parameter Prm1 was acquired. The information on the quality of the product Prd includes, for example, the result of a quality inspection performed in an inspection process, or information on the yield. The air conditioning control estimation parameter Prm1 may include images of each of the workers Op1 to Op3 at work.
The air conditioning system 20 includes an outdoor unit 21, an indoor unit 22, and an air conditioning controller 23. The outdoor unit 21 is located outside the factory 30. The indoor unit 22 and the air conditioning controller 23 are located inside the factory 30. The outdoor unit 21 includes a fan, a compressor, and a heat exchanger. The indoor unit 22 includes a fan, a heat exchanger, and an expansion valve. The air conditioning controller 23 includes a thermostat. The air conditioning controller 23 receives an air conditioning intensity control parameter Prm2 (second parameter) from the management server 10 and controls the outdoor unit 21 and the indoor unit 22 accordingly. The parameter Prm2 includes the ON/OFF state of the thermostat, the rotation frequency of the compressor, the airflow strength of the fan, the evaporation temperature of the refrigerant, and the condensation temperature of the refrigerant.
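As a rough illustration of the two parameters described above, the following Python sketch models Prm1 and Prm2 as immutable records. The class names, field names, units, and the choice of a yield rate as the quality information are all assumptions made for this example; the source lists only the kinds of information each parameter contains.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class EstimationParams:
    """Hypothetical container for Prm1 (first parameter)."""
    worker_ids: Tuple[str, ...]        # identification information of workers
    equipment_ids: Tuple[str, ...]     # identification information of equipment
    item: str                          # item of the product being produced
    takt_time_s: float                 # takt time (seconds is an assumed unit)
    yield_rate: float                  # quality information (assumed: yield, 0..1)
    acquired_at: datetime              # time at which Prm1 was acquired
    worker_images: Optional[Tuple[bytes, ...]] = None  # optional images of workers


@dataclass(frozen=True)
class IntensityParams:
    """Hypothetical container for Prm2 (second parameter)."""
    thermostat_on: bool                # thermostat ON/OFF
    compressor_freq_hz: float          # compressor rotation frequency
    fan_level: int                     # fan airflow strength (assumed discrete)
    evaporation_temp_c: float          # refrigerant evaporation temperature
    condensation_temp_c: float         # refrigerant condensation temperature
```

Freezing the records makes them hashable, which would let them serve as dictionary keys; a practical system would probably need to discretize the continuous fields first.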
FIG. 3 is a block diagram showing the configuration of the learning device 100 of FIG. 1. As shown in FIG. 3, the learning device 100 includes a data acquisition unit 110 (first data acquisition unit) and a model generation unit 120. The data acquisition unit 110 acquires the air conditioning control estimation parameter Prm1 and the air conditioning intensity control parameter Prm2 as learning data.
The model generation unit 120 learns air conditioning intensity control using learning data that includes the air conditioning control estimation parameter Prm1 and the air conditioning intensity control parameter Prm2. That is, the model generation unit 120 generates a trained model that infers the air conditioning intensity control parameter Prm2 from the air conditioning control estimation parameter Prm1. The model generation unit 120 can use a known learning algorithm such as supervised learning, unsupervised learning, or reinforcement learning. The following describes, as an example, the case where reinforcement learning is applied. In reinforcement learning, an agent (the acting entity) in an environment observes the current state (the parameters of the environment) and decides which action to take. The agent's action changes the environment dynamically, and the agent receives a reward according to that change. By repeating this cycle, the agent learns the action policy that yields the greatest reward over a series of actions. Q-learning and TD-learning are known as representative reinforcement learning methods. In Q-learning, for example, the general update equation for the action-value function Q(s_t, a_t) is expressed as equation (1) below.
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left( r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right) \tag{1}$$
In equation (1), s_t represents the state of the environment at time t, and a_t represents the action at time t. The action a_t changes the state from s_t to s_{t+1}. r_{t+1} represents the reward obtained through that change of state, γ represents the discount rate, and α represents the learning coefficient, where 0 < γ ≤ 1 and 0 < α ≤ 1. The air conditioning intensity control parameter Prm2 is the action a_t, and the air conditioning control estimation parameter Prm1 of the production site is the state s_t. By repeatedly updating the action-value function Q(s, a) of equation (1), the agent learns the best action a_t in the state s_t at time t.
The update equation (1) increases the action value Q of the action a executed at time t when the Q value of the action with the highest action value (evaluation value) at time t+1 is larger than it, and decreases the action value Q in the opposite case. In other words, the action-value function Q(s, a) is updated so that the action value Q of the action a at time t approaches the best action value at time t+1. As a result, the best action value in a given environment propagates successively back to the action values in the environments that preceded it.
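The update described above can be sketched in Python as a single tabular Q-learning step. The function name `q_update`, the dictionary-based table, and the default values of `alpha` and `gamma` are assumptions made for illustration; the source does not say how the action-value function is stored.

```python
from collections import defaultdict


def q_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(s, a).

    q        : mapping (state, action) -> value; missing entries count as 0.0
    s, a     : state s_t (Prm1) and action a_t (Prm2), as hashable keys
    r        : reward r_{t+1} observed after taking a in s
    s_next   : resulting state s_{t+1}
    actions  : candidate actions used for the max over a
    """
    # Best achievable value from the next state (the max term of the update).
    best_next = max(q[(s_next, a2)] for a2 in actions)
    # Move Q(s, a) toward r + gamma * best_next by a fraction alpha.
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
    return q[(s, a)]


# Table with 0.0 as the default value (an assumed initialization).
q_table = defaultdict(float)
```

With `alpha=0.5`, a reward of 1 on an empty table moves Q(s, a) from 0 to 0.5; once the next state has a known good action, its discounted value is added to the target, which is the backward propagation of value described above.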
As described above, when the trained model is generated by reinforcement learning, the model generation unit 120 includes a reward calculation unit 121 and a function update unit 122. The reward calculation unit 121 calculates a reward using the air conditioning control estimation parameter Prm1 and the air conditioning intensity control parameter Prm2. It calculates the reward r according to the increase or decrease in productivity, where productivity is the number of products Prd actually produced in the factory 30 per unit time (for example, units per hour). Specifically, the reward calculation unit 121 calculates the reward r according to the degree of deviation between the productivity of the factory 30 and the sum of the individual reference productivities of the workers Op1 to Op3, or between the productivity of the factory 30 and the reference productivity corresponding to a reference takt time. For example, when the productivity of the factory 30 increases relative to the previous value, the reward r is increased (for example, a reward of "1" is given); when the productivity of the factory 30 decreases relative to the previous value, the reward r is reduced (for example, a reward of "-1" is given).
The function update unit 122 updates the function for determining the air conditioning intensity control parameter Prm2 according to the reward calculated by the reward calculation unit 121, and outputs the function to the trained model storage unit 140. In Q-learning, for example, the action-value function Q(s_t, a_t) of equation (1) is used as the function for calculating the air conditioning intensity control parameter Prm2.
The learning device 100 repeatedly executes the learning described above. The trained model storage unit 140 stores the trained model, which is the action-value function Q(s_t, a_t) updated by the function update unit 122.
FIG. 4 is a flowchart showing the learning process of the learning device 100 of FIG. 3. In the following, a step is abbreviated as S. As shown in FIG. 4, in S101 the data acquisition unit 110 acquires the air conditioning control estimation parameter Prm1 and the air conditioning intensity control parameter Prm2 as learning data. Specifically, the data acquisition unit 110 attaches to the identification information of each of the workers Op1 to Op3 the identification information of the equipment at which that worker is working, the reference takt time for that worker, and the working time, and attaches to each temperature and humidity reading the position and time at which it was measured.
In S102, the model generation unit 120 calculates the reward using the air conditioning control estimation parameter Prm1 and the air conditioning intensity control parameter Prm2. Specifically, the reward calculation unit 121 acquires Prm1 and Prm2 and, based on the degree of deviation between the actual productivity of the factory 30 and the reference productivity, which is a predetermined reward criterion, determines whether to increase (S103) or decrease (S104) the reward corresponding to Prm2. When the actual productivity of the factory 30 is greater than the reference productivity, the reward calculation unit 121 increases the reward in S103. When the actual productivity of the factory 30 is smaller than the reference productivity, it decreases the reward in S104.
As the reward criterion, the yield of the product Prd may be used instead: the reward is increased when the yield is greater than a reference yield and decreased when it is smaller. This makes it possible to improve the quality of the product Prd.
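The reward criteria of S102 to S104 can be sketched as a simple comparison against a reference value. In the Python sketch below, the function names are hypothetical, the reward of 0 for an exact tie is an assumption (the source covers only the greater and smaller cases), and converting a reference takt time in seconds to units per hour as 3600 / takt is likewise an assumed interpretation of "reference productivity corresponding to a reference takt time".

```python
def reward_from_reference(actual, reference):
    """Return +1 if actual exceeds the reference, -1 if below (S103 / S104).

    Works for productivity (units per hour) or for yield, depending on
    which reward criterion is chosen. A tie returns 0.0 (assumed).
    """
    if actual > reference:
        return 1.0
    if actual < reference:
        return -1.0
    return 0.0


def reference_productivity_from_takt(takt_time_s):
    """Assumed conversion: units per hour implied by a takt time in seconds."""
    if takt_time_s <= 0:
        raise ValueError("takt time must be positive")
    return 3600.0 / takt_time_s


def reference_productivity_from_workers(per_worker_reference):
    """Sum of the individual reference productivities of the workers."""
    return sum(per_worker_reference)
```

The same comparison serves both criteria: pass measured and reference productivity for the productivity-based reward, or measured and reference yield for the quality-based reward.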
In S105, the function update unit 122 uses the reward calculated by the reward calculation unit 121 and equation (1) to update the action-value function Q(s_t, a_t) stored in the trained model storage unit 140.
The learning device 100 repeatedly executes steps S101 to S105 above and stores the resulting action-value function Q(s_t, a_t) as the trained model. Although the learning device 100 is configured here to store the trained model in the trained model storage unit 140 provided outside the learning device 100, the trained model storage unit 140 may instead be provided inside the learning device 100.
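The repeated S101 to S105 loop can be sketched as follows, assuming the learning data arrive as logged tuples of (state, action, observed productivity, next state). The function name `train`, the tuple layout, the zero reward on a tie, and the returned plain dictionary standing in for the trained model storage unit 140 are all assumptions for illustration.

```python
from collections import defaultdict


def train(records, actions, reference, alpha=0.1, gamma=0.9):
    """Run the S101-S105 loop over logged records and return the Q table.

    records   : iterable of (state, action, productivity, next_state)
    actions   : candidate actions (Prm2 values) for the max over a
    reference : reference productivity used as the reward criterion
    """
    q = defaultdict(float)
    for state, action, productivity, next_state in records:  # S101: acquire data
        # S102-S104: raise or lower the reward against the reference.
        if productivity > reference:
            reward = 1.0
        elif productivity < reference:
            reward = -1.0
        else:
            reward = 0.0  # tie handling is an assumption
        # S105: update the action-value function with equation (1).
        best_next = max(q[(next_state, a)] for a in actions)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return dict(q)  # stands in for storing the trained model
```

This is an offline variant that replays recorded data; an online agent would instead choose each action itself, typically with some exploration, which the source does not detail.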
FIG. 5 is a block diagram showing the configuration of the inference device 200 of FIG. 1. The inference device 200 includes a data acquisition unit 210 and an inference unit 220. The data acquisition unit 210 acquires the air conditioning control estimation parameter Prm1. The inference unit 220 infers the air conditioning intensity control parameter Prm2 using the trained model stored in the trained model storage unit 140. That is, by inputting the production-site parameter Prm1 acquired by the data acquisition unit 210 into the trained model, the inference unit 220 can infer an air conditioning intensity control parameter Prm2 suited to that Prm1. Although the embodiment describes a configuration that infers Prm2 using the trained model generated by the model generation unit 120 of FIG. 3, the air conditioning intensity control parameter may instead be output using a trained model that was trained in another environment.
FIG. 6 is a flowchart showing the inference process of the inference device 200 of FIG. 5. As shown in FIG. 6, in S201 the data acquisition unit 210 acquires the air conditioning control estimation parameter Prm1 of the production site. In S202, the inference unit 220 inputs Prm1 into the trained model stored in the trained model storage unit 140 and obtains the air conditioning intensity control parameter Prm2. In S203, the inference unit 220 outputs Prm2 to the air conditioning system 20. In S204, the air conditioning system 20 uses the Prm2 output from the inference device 200 to perform air conditioning control at an intensity that increases the productivity change predicted for the near future. Conventional air conditioning control with a uniform temperature setting could not avoid productivity fluctuations that depend on people, equipment, cost items, and time. By contrast, the control described here aims to raise the estimated near-future productivity, so that stable and high productivity can be maintained.
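For a tabular action-value function, the inference of S202 reduces to choosing the action with the highest stored value for the observed state. The following Python sketch assumes the trained model is a dictionary keyed by (state, action); the function name `infer_action` and the fallback to the first candidate for an unseen state are assumptions, not behavior stated in the source.

```python
def infer_action(q, state, actions):
    """Return the action (Prm2) with the highest Q value for the state (Prm1).

    q       : trained model as a mapping (state, action) -> value
    state   : observed state acquired in S201
    actions : non-empty sequence of candidate actions

    Unseen (state, action) pairs are treated as 0.0, so a state that never
    appeared in training yields the first candidate (assumed fallback).
    """
    if not actions:
        raise ValueError("at least one candidate action is required")
    return max(actions, key=lambda a: q.get((state, a), 0.0))
```

The returned action would then be sent to the air conditioning controller, corresponding to S203.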
Although this embodiment describes the case where reinforcement learning is applied as the learning algorithm used by the inference unit, the learning algorithm is not limited to reinforcement learning. Supervised learning, unsupervised learning, semi-supervised learning, and the like can also be applied.
The learning algorithm used by the model generation unit 120 may also be deep learning, which learns the extraction of the features themselves. Machine learning may also be performed according to other known methods, such as neural networks, genetic programming, functional logic programming, or support vector machines.
The learning device 100 and the inference device 200 may be, for example, devices separate from the air conditioning system 20 that are connected to it via a network. Alternatively, the learning device 100 and the inference device 200 may be built into the air conditioning system 20, or may reside on a cloud server.
Instead of acquiring data for each worker directly, worker data may be simplified by defining worker personas from several viewpoints, such as age, skill level, and gender (for example, a male newcomer in his 20s), and setting a worker model for each persona. Similarly, the data structure of the air conditioning control estimation parameter Prm1 may be simplified by preparing factories, equipment, and lines in advance as a set of models.
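One way to picture the persona simplification is a function that maps a worker's attributes to a coarse persona key, so that the state uses the persona instead of the individual worker ID. In the Python sketch below, the function name, the ten-year age brackets, and the label strings are assumptions for illustration only.

```python
def persona_key(age, skill_level, gender):
    """Map worker attributes to a coarse persona key.

    age         : age in years (bucketed into ten-year brackets, assumed)
    skill_level : label such as "novice" or "expert" (hypothetical labels)
    gender      : label such as "male" or "female"
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    decade = (age // 10) * 10  # e.g. 24 -> 20
    return (f"{decade}s", skill_level, gender)
```

Workers who share a key share one worker model, which shrinks the state space at the cost of ignoring individual differences.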
The model generation unit 120 may also learn air conditioning intensity control using learning data acquired from a plurality of air conditioning systems 20. The model generation unit 120 may acquire learning data from a plurality of air conditioning systems 20 used in the same area, or may learn air conditioning intensity control using learning data collected from a plurality of air conditioning systems 20 operating independently in different areas. An air conditioning system 20 from which learning data is collected can also be added to or removed from the learning targets partway through. Furthermore, a learning device 100 that has learned air conditioning intensity control for one air conditioning system 20 may be applied to a different air conditioning system 20, and may relearn and update the air conditioning intensity control for that other air conditioning system.
FIG. 7 is a block diagram showing the hardware configuration of the information processing system 11 of FIG. 1. As shown in FIG. 7, the information processing system 11 includes a processing circuit 51, a memory 52 (storage unit), and an input/output unit 53. The processing circuit 51 includes a CPU (Central Processing Unit) that executes programs stored in the memory 52. The processing circuit 51 may include a GPU (Graphics Processing Unit). The functions of the information processing system 11 are realized by software, firmware, or a combination of software and firmware. The software or firmware is written as a program and stored in the memory 52. The processing circuit 51 reads and executes the program stored in the memory 52. The CPU is also referred to as a central processing unit, a processing unit, an arithmetic unit, a microprocessor, a microcomputer, a processor, or a DSP (Digital Signal Processor).
The memory 52 includes non-volatile or volatile semiconductor memory (for example, RAM (Random Access Memory), ROM (Read Only Memory), flash memory, EPROM (Erasable Programmable Read Only Memory), or EEPROM (Electrically Erasable Programmable Read Only Memory)), as well as a magnetic disk, flexible disk, optical disc, compact disc, MiniDisc, or DVD (Digital Versatile Disc). The memory 52 stores, for example, the trained model, an air conditioning program, and a machine learning program.
The input/output unit 53 receives operations from the user and outputs processing results to the user. The input/output unit 53 includes, for example, a mouse, a keyboard, a touch panel, a display, and a speaker.
As described above, the learning device and the inference device according to the embodiment can improve the productivity of factory workers.
The embodiment disclosed here should be considered illustrative in all respects and not restrictive. The scope of the present disclosure is defined by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims.
10 management server, 11 information processing system, 12 data collection/processing system, 20 air conditioning system, 21 outdoor unit, 22 indoor unit, 23 air conditioning controller, 30 factory, 51 processing circuit, 52 memory, 53 input/output unit, 100 learning device, 110, 210 data acquisition unit, 120 model generation unit, 121 reward calculation unit, 122 function update unit, 140 trained model storage unit, 200 inference device, 220 inference unit, Eq1 to Eq3 equipment, Op1 to Op3 workers, Prd product, Sn1 to Sn3, Sn10, Sn11 temperature/humidity sensors, Wrk workpiece.

Claims (9)

  1.  A learning device that learns control of an air conditioning system of a factory including at least one piece of equipment, the learning device comprising:
     a first data acquisition unit that acquires learning data including a first parameter representing the state of the at least one piece of equipment and the air conditioning system, and a second parameter relating to the intensity of air conditioning by the air conditioning system; and
     a model generation unit that uses the learning data to generate a trained model that infers the second parameter from the first parameter,
     wherein the first parameter includes identification information of a worker who works at each of the at least one piece of equipment, the item of a product produced by the at least one piece of equipment, identification information of the at least one piece of equipment, the takt time of the product, information on the quality of the product, and information on the time at which the first parameter was acquired.
  2.  The learning device according to claim 1, wherein the first parameter includes an image of the worker at work.
  3.  The learning device according to claim 1 or 2, wherein the trained model includes a function that associates the first parameter with an evaluation value of the second parameter.
  4.  The learning device according to claim 3, wherein the model generation unit updates the evaluation value of the second parameter according to the degree of deviation between a reference productivity and the productivity in the factory under air conditioning by the air conditioning system controlled according to the second parameter.
  5.  The learning device according to claim 3, wherein the model generation unit updates the evaluation value of the second parameter according to a change in the yield of the product produced under air conditioning by the air conditioning system controlled according to the second parameter.
  6.  An inference device comprising:
     a second data acquisition unit that acquires the first parameter; and
     an inference unit that, using the trained model generated by the learning device according to any one of claims 1 to 5, outputs the second parameter from the first parameter acquired by the second data acquisition unit.
  7.  An inference device that outputs control of an air conditioning system of a factory including at least one piece of equipment, the inference device comprising:
     a data acquisition unit that acquires a first parameter representing the state of the at least one piece of equipment and the air conditioning system; and
     an inference unit that, using a trained model that infers from the first parameter a second parameter relating to the intensity of air conditioning by the air conditioning system, outputs the second parameter from the first parameter acquired by the data acquisition unit,
     wherein the first parameter includes identification information of a worker who works at each of the at least one piece of equipment, the item of a product produced by the at least one piece of equipment, identification information of the at least one piece of equipment, the takt time of the product, information on the quality of the product, and information on the time at which the first parameter was acquired.
  8.  The inference device according to claim 7, wherein the first parameter includes an image of the worker at work.
  9.  The inference device according to claim 7 or 8, wherein the trained model includes a function that associates the first parameter with an evaluation value of the second parameter.
PCT/JP2020/014248 2020-03-27 2020-03-27 Learning device and inference device for air-conditioning control WO2021192280A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202080095692.6A CN115280077B (en) 2020-03-27 2020-03-27 Learning device and reasoning device for air conditioner control
JP2022510379A JP7414964B2 (en) 2020-03-27 2020-03-27 Air conditioning control learning device and reasoning device
PCT/JP2020/014248 WO2021192280A1 (en) 2020-03-27 2020-03-27 Learning device and inference device for air-conditioning control

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/014248 WO2021192280A1 (en) 2020-03-27 2020-03-27 Learning device and inference device for air-conditioning control

Publications (1)

Publication Number Publication Date
WO2021192280A1 true WO2021192280A1 (en) 2021-09-30

Family

ID=77890005

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/014248 WO2021192280A1 (en) 2020-03-27 2020-03-27 Learning device and inference device for air-conditioning control

Country Status (3)

Country Link
JP (1) JP7414964B2 (en)
CN (1) CN115280077B (en)
WO (1) WO2021192280A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017142595A (en) * 2016-02-09 2017-08-17 ファナック株式会社 Production control system and integrated production control system
CN109695944A (en) * 2018-11-29 2019-04-30 中国汽车工业工程有限公司 A kind of control method of the coating fresh air conditioner based on multi-model deep learning
WO2019087538A1 (en) * 2017-10-30 2019-05-09 ダイキン工業株式会社 Concentration estimation device

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4059014B2 (en) * 2001-06-19 2008-03-12 富士電機システムズ株式会社 Optimal plant operation method and optimal plant design method
JP5572799B2 (en) * 2010-04-01 2014-08-20 三菱電機株式会社 Air conditioning system controller
JP6289749B2 (en) * 2015-05-18 2018-03-07 三菱電機株式会社 Indoor environment model creation device
WO2017098552A1 (en) 2015-12-07 2017-06-15 三菱電機株式会社 Control device, air-conditioning system, and control method for air conditioners
WO2018084577A1 (en) * 2016-11-03 2018-05-11 Samsung Electronics Co., Ltd. Data recognition model construction apparatus and method for constructing data recognition model thereof, and data recognition apparatus and method for recognizing data thereof
JP6457472B2 (en) * 2016-12-14 2019-01-23 ファナック株式会社 Control system and machine learning device
JP2019066135A (en) * 2017-10-04 2019-04-25 ファナック株式会社 Air-conditioning control system
CN108050676B (en) * 2017-10-30 2019-12-10 珠海格力电器股份有限公司 Air conditioner and control method, device and system thereof
KR102071960B1 (en) * 2018-01-18 2020-01-31 엘지전자 주식회사 Control device of configuring parameter based on artificial intelligent learning of space where the air-conditioner is installed
JP2019215109A (en) 2018-06-11 2019-12-19 ダイキン工業株式会社 Air conditioning system
KR102051011B1 (en) * 2018-11-27 2019-12-02 오아 주식회사 Server and method for controlling learning-based speech recognition apparatus
CN110726222B (en) * 2019-10-29 2020-12-29 珠海格力电器股份有限公司 Air conditioner control method and device, storage medium and processor
JPWO2021192279A1 (en) 2020-03-27 2021-09-30

Also Published As

Publication number Publication date
CN115280077B (en) 2024-03-08
JPWO2021192280A1 (en) 2021-09-30
CN115280077A (en) 2022-11-01
JP7414964B2 (en) 2024-01-16

JPH05223323A (en) Controller for air conditioner
CN115950080A (en) Heating ventilation air conditioner regulation and control method and device based on reinforcement learning

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20926841; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2022510379; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20926841; Country of ref document: EP; Kind code of ref document: A1)