US20220178572A1 - Air conditioning control system and air conditioning control method

Info

Publication number: US20220178572A1
Application number: US17/437,393
Authority: US
Inventors: Shouta TANAKA; Takeshi MORINIBU; Tomohiro Noda
Original assignee: Daikin Industries Ltd
Current assignee: Daikin Industries Ltd
Priority date: 2019-03-13
Filing date: 2020-03-06
Publication date: 2022-06-09
Also published as: EP3940306A4; JP2020148385A; JP7071307B2; WO2020184454A1; EP3940306A1; CN113544439A

Abstract

An air conditioning control system controls an air conditioning apparatus that performs an air conditioning operation in a target space. The air conditioning control system includes a detecting unit that detects a temperature distribution of the target space, and a server. The server controls the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, based on the temperature distribution detected by the detecting unit by using a learning device that has learned by deep reinforcement learning.

Description

TECHNICAL FIELD

An air conditioning control system for controlling an air conditioning apparatus and an air conditioning control method for the same.

BACKGROUND ART

According to PTL 1 (Japanese Unexamined Patent Application Publication No. 2012-184899), air conditioning improvement is performed by an air conditioning operation based on a so-called rule base to approach a target temperature distribution.

SUMMARY OF THE INVENTION

Technical Problem

In an air conditioning operation based on a rule base, a long time is taken to approach a target temperature distribution.

Solution to Problem

An air conditioning control system according to a first aspect is a system for controlling an air conditioning apparatus that performs an air conditioning operation in a target space. The air conditioning control system includes a detecting unit and a server. The detecting unit detects a temperature distribution of the target space. The server controls the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, on the basis of the temperature distribution detected by the detecting unit, by using a learning device that has learned by deep reinforcement learning.
The learning device performs learning, by deep reinforcement learning, to cause the temperature distribution of the target space to efficiently approach the target temperature distribution. Accordingly, the temperature distribution of the target space can be caused to efficiently approach the target temperature distribution in a short time.
An air conditioning control system according to a second aspect is the system according to the first aspect, in which the detecting unit includes an infrared sensor that detects a thermal image of the target space. The detecting unit detects the temperature distribution of the target space on the basis of the thermal image detected by the infrared sensor.
An air conditioning control system according to a third aspect is the system according to the first aspect or the second aspect, in which the learning device is updated on the basis of a temperature distribution of the target space obtained after the server has controlled the air conditioning apparatus. The infrared sensor detects a new thermal image of the target space after the server has controlled the air conditioning apparatus. The detecting unit detects a new temperature distribution of the target space on the basis of the new thermal image. The server controls the air conditioning apparatus such that the new temperature distribution approaches the target temperature distribution, by using the updated learning device.
An air conditioning control system according to a fourth aspect is the system according to any one of the first aspect to the third aspect, in which the target temperature distribution is a uniformized temperature distribution or a temperature distribution including predetermined temperature variations. The learning device has learned by the deep reinforcement learning on the basis of a value obtained through statistical processing performed on the temperature distribution of the target space and the target temperature distribution.
An air conditioning control system according to a fifth aspect is the system according to any one of the first aspect to the fourth aspect, in which the server controls the air conditioning apparatus such that the temperature distribution at a floor surface and a wall surface of the target space approaches the target temperature distribution.
An air conditioning control system according to a sixth aspect is the system according to any one of the first aspect to the fifth aspect, in which the infrared sensor is built in the air conditioning apparatus or is disposed in the target space.
An air conditioning control system according to a seventh aspect is the system according to any one of the first aspect to the sixth aspect, in which the server controls at least any of an operation mode, a set temperature, an air direction, and an air volume in the air conditioning operation performed by the air conditioning apparatus to cause the temperature distribution of the target space to approach the target temperature distribution.
An air conditioning control method according to an eighth aspect is a method for controlling an air conditioning apparatus that performs an air conditioning operation in a target space. The air conditioning control method includes a step of detecting a thermal image of the target space, a step of detecting a temperature distribution of the target space on the basis of the thermal image, and a step of controlling the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, on the basis of the temperature distribution by using a learning device that has learned by deep reinforcement learning.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of an air conditioning control system.

FIG. 2 is a block diagram illustrating the configuration of the air conditioning control system.

FIG. 3 is a schematic diagram of a memory database.

FIG. 4 is a block diagram illustrating the configuration of functional units of the air conditioning control system.

FIG. 5 is a flowchart illustrating a process performed by the entire air conditioning control system.

FIG. 6 is a flowchart illustrating a process performed by the functional units of the air conditioning control system.

DESCRIPTION OF EMBODIMENTS

Hereinafter, an air conditioning control system 1 according to an embodiment of the present disclosure will be described. The following embodiment is a specific example, does not limit the technical scope, and can be appropriately changed without deviating from the gist.
(1) Overall Configuration
As illustrated in FIG. 1, the air conditioning control system 1 includes an air conditioning apparatus 100, a detecting unit 200, and a server 300.
An indoor unit of the air conditioning apparatus 100 and the detecting unit 200 are installed in a target space α in which the air conditioning apparatus 100 performs an air conditioning operation, such as an office or a residence. The air conditioning operation includes cooling, heating, dehumidifying, and the like. The target space α includes a wall α1, a floor α2, an obstacle α3, and so forth. The server 300 is installed in a management room or the like outside the target space α.
The air conditioning apparatus 100 and the detecting unit 200 are connected to each other via a wireless or wired network N. The air conditioning apparatus 100 and the server 300 are connected to each other via the wireless or wired network N.
(2) Detailed Configuration
(2-1) Air Conditioning Apparatus 100
The air conditioning apparatus 100 illustrated in FIG. 2 performs an air conditioning operation in the target space α. The air conditioning apparatus 100 mainly includes a control unit 101, a compressor 102, a louver 103, and a fan 104. The individual components of the air conditioning apparatus 100 may be disposed in either of the indoor unit installed in the target space α and an outdoor unit installed outside the target space α. The air conditioning apparatus 100 may include a component other than these components.
The control unit 101 controls an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction. The control instruction is received from the server 300 via the network N. The control instruction includes requests or the like for changing an operation mode, a set temperature, an air direction, an air volume, and the like. On the basis of the control instruction, the control unit 101, for example, controls an output or the like of the compressor 102 to change a set temperature, changes the direction of the louver 103 to change an air direction, or controls an output of the fan 104 to change an air volume.
(2-2) Detecting Unit 200
In the present embodiment, the detecting unit 200 is built in the air conditioning apparatus 100 and is attached to a front surface of the air conditioning apparatus 100, as illustrated in FIG. 1. Alternatively, the detecting unit 200 may be attached to the wall α1, the ceiling, or the like in the target space α.
The detecting unit 200 detects a temperature distribution of the target space α on the basis of a thermal image. The detecting unit 200 includes an infrared sensor 201. The infrared sensor 201 acquires a two-dimensional thermal image. The infrared sensor 201 includes, for example, a pixel group arranged in a two-dimensional matrix, and has a structure capable of acquiring a two-dimensional thermal image at one time. Alternatively, for example, the infrared sensor 201 may include a pixel group (line sensor) arranged one-dimensionally and may have a structure of one-dimensionally scanning the pixel group to acquire a two-dimensional thermal image. Alternatively, the infrared sensor 201 may include one or more pixels and may have a structure of two-dimensionally scanning the one or more pixels to acquire a two-dimensional thermal image. Here, the configuration of the infrared sensor 201 is not limited.
A thermal image is displayed in such a manner that a portion (pixel) having a higher temperature in the target space α has a higher density. Accordingly, it is possible to acquire a temperature distribution of the target space α and determine the temperature distribution of the target space α. Display of a thermal image is not limited thereto.
The temperature distribution and the target temperature distribution herein are based on a value obtained through statistical processing. For example, a uniformized temperature distribution of the target space α does not require the wall α1, the floor α2, the obstacle α3, and so forth in the target space α to have an identical temperature.
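As an illustrative sketch only, the statistical processing mentioned above could reduce a thermal image to a few summary values. The function names `summarize_distribution` and `is_uniform`, the choice of mean and standard deviation as the statistics, the tolerance, and the sample pixel values are all assumptions and are not taken from the present disclosure.

```python
from statistics import mean, pstdev


def summarize_distribution(thermal_image):
    """Reduce a 2D thermal image (a list of pixel rows) to summary statistics."""
    pixels = [pixel for row in thermal_image for pixel in row]
    return {"mean": mean(pixels), "spread": pstdev(pixels)}


def is_uniform(stats, tolerance):
    """Treat the distribution as uniformized when the spread is within a tolerance."""
    return stats["spread"] <= tolerance


# Hypothetical 2x3 thermal image; the pixel values are illustrative only.
image = [[20, 22, 24], [20, 22, 24]]
stats = summarize_distribution(image)
```

Under this reading, individual surfaces may differ in temperature while the distribution as a whole still counts as uniformized, as long as the chosen statistic stays within the tolerance.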
(2-3) Server 300
The server 300 holds, as a learning device, a deep neural network (DNN) which is a neural network including an input layer, a plurality of intermediate layers, and an output layer. The server 300 determines a control instruction to be transmitted to the air conditioning apparatus 100 on the basis of information that is output from the output layer in response to input of input information to the input layer. The server 300 transmits the control instruction to the air conditioning apparatus 100 via the network N to control the air conditioning apparatus 100.
In FIG. 1 and FIG. 2, the server 300 is illustrated as a single apparatus. However, it is preferable that the server 300 be compatible with cloud computing. Thus, the hardware configuration of the server 300 need not be accommodated in a single housing or provided as a single apparatus. For example, the server 300 is configured as a result of hardware resources of the server 300 being dynamically connected or disconnected in accordance with a load.
The server 300 includes a processor 301, a memory 302, and auxiliary storage 307. These components are connected to each other via a bus. The memory 302 and the auxiliary storage 307 are examples of a storage device.
The processor 301 executes various calculation processes while referring to the memory 302. The memory 302 stores an experience acquisition program 303, an air conditioning control program 304, a neural network program 305, and a learning program 306. The learning device included in the server 300 executes these various programs to learn and to be updated. The learning device according to the present embodiment learns, by deep reinforcement learning, a control instruction for making a temperature distribution of the target space α a target temperature distribution by using the simplest possible air conditioning control. The target temperature distribution is a temperature distribution that is set in advance by a user who uses the air conditioning apparatus 100 or a manager who manages the server 300. The target temperature distribution is, for example, a uniformized temperature distribution or a temperature distribution including temperature variations.
The experience acquisition program 303 is a program for acquiring an experience that results when the server 300 provides a control instruction to the air conditioning apparatus 100. An experience is represented by, for example, a temperature distribution of the target space α acquired by the detecting unit 200 (state), a control instruction (action), a reward, and a temperature distribution of the target space α acquired by the detecting unit 200 after the air conditioning apparatus 100 has performed an air conditioning operation on the basis of a control instruction (result). The experience acquired by the experience acquisition program 303 is stored in a memory database 308.
The air conditioning control program 304 is a program for determining a control instruction for controlling an air conditioning operation performed by the air conditioning apparatus 100. The control instruction is defined in accordance with the ability and specifications of the air conditioning apparatus 100 or the purpose of learning. For example, when a purpose of learning is causing the learning device to learn to transmit, to the air conditioning apparatus 100, a control instruction to cause a temperature distribution of the target space α to approach a uniformized temperature distribution, the control instruction includes instructions to change an operation mode, a set temperature, an air direction, and an air volume. The ranges of these control instructions are determined in accordance with the ability and specifications of the air conditioning apparatus 100.
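A minimal sketch of how such a set of control instructions might be enumerated is given below. The type `ControlInstruction`, the table `ACTIONS`, and every value range shown are hypothetical; actual ranges would depend on the ability and specifications of the air conditioning apparatus 100.

```python
from itertools import product
from typing import NamedTuple


class ControlInstruction(NamedTuple):
    mode: str
    set_temperature: int  # degrees Celsius
    air_direction: str
    air_volume: str


# Hypothetical ranges, chosen only to keep the example small.
MODES = ("cool", "heat")
SET_TEMPERATURES = (24, 26)
AIR_DIRECTIONS = ("up", "down")
AIR_VOLUMES = ("low", "high")

# Each positive action ID maps to exactly one control instruction.
ACTIONS = {
    action_id: ControlInstruction(*combo)
    for action_id, combo in enumerate(
        product(MODES, SET_TEMPERATURES, AIR_DIRECTIONS, AIR_VOLUMES), start=1
    )
}
```

With two options per setting, the table holds 16 actions; a wider range for any setting multiplies the number of actions the learning device has to evaluate.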
In the present embodiment, the neural network program 305 has an input which is a temperature distribution of the target space α, and an output which is a Q value (action evaluation value) of a control instruction to be transmitted to the air conditioning apparatus 100. The neural network is an evaluation model (or an evaluation function) that determines an evaluation value and the parameters thereof are updated by the learning program 306 as appropriate. The air conditioning control system 1 disclosed below is a system trained by deep reinforcement learning, in which an action evaluation model is represented by a deep neural network. The neural network program 305 can be edited and is customized in accordance with a system to which the program is applied. For example, in the present embodiment, the neural network program 305 first processes a temperature distribution detected by the detecting unit 200 by using convolution and pooling, and extracts features. Furthermore, the neural network program 305 couples the extracted features to a long short-term memory (LSTM), adds a time-series influence, and outputs a result as a Q value.
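The feature extraction stage described above can be illustrated with a small pure-Python sketch. It covers only the convolution and pooling step; the LSTM coupling and the Q value output are omitted, and a practical system would normally use a deep learning library instead. The function names, the kernel, and the 5x5 temperature grid are assumptions made for illustration.

```python
def convolve2d(image, kernel):
    """Valid (no padding) 2D cross-correlation of an image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = len(feature_map) // size
    w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(w)
        ]
        for i in range(h)
    ]


# Hypothetical temperature distribution with a warm region on the right side.
temperature = [[20, 20, 20, 26, 26] for _ in range(5)]
# Hypothetical kernel that responds to horizontal temperature differences.
kernel = [[-1, 1]]
edges = convolve2d(temperature, kernel)
features = max_pool2d(edges, size=2)
```

Here the convolution highlights the boundary between the cooler and warmer regions, and pooling condenses that response into a smaller feature map.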
The learning program 306 updates and optimizes the parameters of the neural network. The learning program 306 optimizes the parameters of the neural network by advantage learning, for example. Accordingly, the neural network becomes capable of more accurately estimating Q values of individual actions in a given state, and the server 300 is capable of determining a more intelligent control instruction.
The auxiliary storage 307 stores the memory database 308 and a neural network database 309.
FIG. 3 is a schematic diagram of the memory database 308 according to the present embodiment. The memory database 308 has a limited capacity. The capacity is determined in advance by an engineer or the like. When the memory database 308 becomes full, the first experience in the memory database 308 is deleted and a vacant space for a new experience is formed. The memory database 308 has columns of an index 318, a state 328, an action 338, a reward 348, and a result 358. The memory database 308 may have any structure as long as information on experiences can be stored therein.
The index 318 indicates integers, which indicate the order of experiences stored in the memory database 308. The index 318 indicates which of the experiences stored in the memory database 308 is the oldest and is to be deleted when the memory database 308 is full and a new experience is to be stored.
The state 328 includes information about the target space α, that is, information on a temperature distribution acquired from a thermal image. The state 328 may include, for example, values related to the target space α acquired by various sensors of the air conditioning apparatus 100.
The action 338 holds positive numerical values. Each value is an action ID that identifies one specific action, that is, one control instruction that the server 300 can transmit to the air conditioning apparatus 100.
The reward 348 indicates numerical values each defining a reward that can be acquired after the state of the target space α has been changed in response to an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction transmitted by the server 300. For example, when a control instruction to change the direction of the louver 103 is provided to the air conditioning apparatus 100 and a resulting temperature distribution moves away from a target temperature distribution, an acquired reward is a negative value. On the other hand, when the foregoing instruction is provided and a resulting temperature distribution approaches a target temperature distribution, an acquired reward is a positive value. Rewards for individual actions in individual states are set in advance.
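The sign convention for rewards can be sketched as follows. This is an assumption-laden illustration: the present disclosure states that rewards are set in advance, whereas this sketch derives the sign from a hypothetical distance measure (mean absolute difference), and the reward magnitudes of +1, -1, and 0 are likewise hypothetical.

```python
def distance_to_target(distribution, target):
    """Mean absolute difference between two equally sized 2D distributions."""
    diffs = [
        abs(d - t)
        for d_row, t_row in zip(distribution, target)
        for d, t in zip(d_row, t_row)
    ]
    return sum(diffs) / len(diffs)


def reward(before, after, target):
    """+1 if the action moved the distribution closer to the target, -1 if farther, else 0."""
    d_before = distance_to_target(before, target)
    d_after = distance_to_target(after, target)
    if d_after < d_before:
        return 1
    if d_after > d_before:
        return -1
    return 0


# Hypothetical distributions for a two-pixel space.
target = [[22, 22]]
before = [[20, 26]]
closer = [[21, 24]]
farther = [[18, 28]]
```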
The result 358 is the state to which the target space α transitions after an action (an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction) has been taken in a given state. The result determines whether the executed control instruction is able to acquire a reward.
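The behavior of the memory database 308 described above (limited capacity, indexed experiences, deletion of the oldest experience when full) can be sketched with a bounded queue. The class name `MemoryDatabase`, its methods, and the placeholder states are assumptions; the disclosure allows any structure that can store experiences.

```python
from collections import deque, namedtuple

# Columns mirror the index, state, action, reward, and result described above.
Experience = namedtuple("Experience", ["index", "state", "action", "reward", "result"])


class MemoryDatabase:
    def __init__(self, capacity):
        # A deque with maxlen discards its oldest entry when a new one is appended.
        self._rows = deque(maxlen=capacity)
        self._next_index = 0

    def store(self, state, action, reward, result):
        self._rows.append(
            Experience(self._next_index, state, action, reward, result)
        )
        self._next_index += 1

    def oldest_index(self):
        """Index of the experience that would be deleted next, or None if empty."""
        return self._rows[0].index if self._rows else None

    def __len__(self):
        return len(self._rows)


db = MemoryDatabase(capacity=2)
db.store("s0", 1, -1, "s1")
db.store("s1", 2, 1, "s2")
db.store("s2", 1, 1, "s3")  # the database is full, so the experience with index 0 is dropped
```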
The neural network database 309 includes weights and biases of links between nodes in the neural network. A node transmits information to another node by using a weight and a bias. As a result of optimizing the weights and biases by advantage learning, the neural network database 309 is updated so that the neural network is capable of more accurately estimating Q values for individual actions.
FIG. 4 is a block diagram illustrating functional units of the server 300 according to the present embodiment.
An experience acquisition unit 313 is implemented by execution of the experience acquisition program 303 by the processor 301. The experience acquisition unit 313 learns how the environment of the target space α is changed as a result of an air conditioning operation performed by the air conditioning apparatus 100 in response to a control instruction transmitted by the server 300. A state, an action, a reward, and a result are merged into an experience, which is transmitted to the memory database 308. The experience acquisition unit 313 receives a control instruction to be transmitted to the air conditioning apparatus 100 from an air conditioning control unit 314 so as to cause the state of the target space α to approach a target temperature distribution.
The air conditioning control unit 314 determines a control instruction to be transmitted to the air conditioning apparatus 100. The air conditioning control unit 314 is implemented by execution of the air conditioning control program 304 by the processor 301. The air conditioning control unit 314 receives, from the experience acquisition unit 313, a temperature distribution of the target space α acquired by the detecting unit 200 as a state, transmits the state to a neural network unit 315, and acquires Q values for individual actions that can be made.
The air conditioning control unit 314 may or may not use Q value information to determine an action. The air conditioning control unit 314 has a parameter called epsilon (ε) and determines, on the basis of the parameter, whether to use a Q value or search for a random action (ε-greedy method). The parameter ε is set to a fixed value by a developer in advance, or is decreased from 1 to 0 in proportion to a training time. The air conditioning control unit 314 randomly selects a number, compares the number with the ε value, and determines whether to use a Q value or to search for a random action. The air conditioning control unit 314 transmits the determined action to the experience acquisition unit 313. The method of determining an action is not limited thereto.
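The ε-greedy selection described above can be sketched as follows. The function names `select_action` and `linear_epsilon`, the dictionary representation of Q values, and the linear schedule are assumptions made for illustration.

```python
import random


def select_action(q_values, epsilon, rng=random):
    """Pick an action ID from a dict mapping action ID -> Q value (epsilon-greedy)."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))    # search for a random action
    return max(q_values, key=q_values.get)   # use the largest Q value


def linear_epsilon(step, total_steps):
    """Decrease epsilon from 1 to 0 in proportion to training progress."""
    return max(0.0, 1.0 - step / total_steps)


# Hypothetical Q values for three action IDs.
q_values = {1: 0.2, 2: 0.9, 3: 0.5}
greedy_choice = select_action(q_values, epsilon=0.0)
```

With ε = 0 the unit always uses the Q values, and with ε = 1 it always explores; a decaying ε shifts gradually from exploration to exploitation as training proceeds.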
A learning unit 316 optimizes neural network parameters so that the neural network is capable of more accurately estimating Q values for individual actions when an input is a current state of the target space α. The learning unit 316 is implemented by execution of the learning program 306 by the processor 301.
(3) Process of Air Conditioning Control System 1
FIG. 5 is a flowchart illustrating a procedure of a process performed by the entire air conditioning control system 1 according to the present embodiment.
First, in step 401, the infrared sensor 201 acquires a thermal image of the target space α.
In step 402, the detecting unit 200 detects a temperature distribution of the target space α on the basis of the thermal image acquired by the infrared sensor 201. The detected temperature distribution is transmitted to the server 300 via the network N.
In step 403, the server 300 determines a control instruction to be transmitted to the air conditioning apparatus 100 on the basis of the received temperature distribution. Here, the server 300 determines a control instruction capable of achieving a target temperature distribution with the simplest possible control by using the learning device. The server 300 transmits the determined control instruction to the air conditioning apparatus 100.
In step 404, the control unit 101 of the air conditioning apparatus 100 controls the individual devices of the air conditioning apparatus 100 so as to perform an air conditioning operation on the basis of the received control instruction.
In step 405, the detecting unit 200 detects a temperature distribution of the target space α in a state changed by the air conditioning operation performed by the air conditioning apparatus 100. The detected temperature distribution is transmitted to the server 300.
In step 406, the learning device of the server 300 is updated.
In step 407, the server 300 determines, on the basis of the temperature distribution acquired in step 405, whether the temperature distribution of the target space α has reached the target temperature distribution.
If it is determined in step 407 that the target temperature distribution is not reached (NO in 407), the process returns to step 403, and the individual steps are executed again until the target temperature distribution is reached. On the other hand, if it is determined in step 407 that the target temperature distribution is reached (YES in 407), the process ends. In this way, the process of the air conditioning control system 1 is completed.
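The loop of steps 401 to 407 can be sketched as a function that takes each step as a callable. Everything named here is hypothetical: the callables `sense`, `decide`, `actuate`, `update`, and `reached_target`, the `max_iterations` safety cap (which is not part of the described flowchart), and the toy single-temperature room used to exercise the loop.

```python
def run_control_loop(sense, decide, actuate, update, reached_target,
                     max_iterations=100):
    """Run the sense/decide/actuate/update cycle; return how many instructions were issued."""
    state = sense()                              # steps 401-402: thermal image -> distribution
    issued = 0
    for _ in range(max_iterations):
        instruction = decide(state)              # step 403: server picks a control instruction
        actuate(instruction)                     # step 404: apparatus performs the operation
        issued += 1
        new_state = sense()                      # step 405: detect the changed distribution
        update(state, instruction, new_state)    # step 406: update the learning device
        state = new_state
        if reached_target(state):                # step 407: stop when the target is reached
            break
    return issued


# Toy stand-in environment: one temperature value, adjusted by 1 degree per instruction.
room = {"temp": 26}
target = 22
log = []
issued = run_control_loop(
    sense=lambda: room["temp"],
    decide=lambda temp: -1 if temp > target else 1,
    actuate=lambda delta: room.update(temp=room["temp"] + delta),
    update=lambda s, a, s2: log.append((s, a, s2)),
    reached_target=lambda temp: temp == target,
)
```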
Here, in the air conditioning control system 1, a start condition for starting the process and an end condition for ending the process may be defined. Specifically, the process is started when the temperature distribution or the state of the target space α acquired by the various sensors of the air conditioning apparatus 100 satisfies the start condition. When the temperature distribution or the state of the target space α acquired by the various sensors of the air conditioning apparatus 100 satisfies the end condition, the process ends even if the target temperature distribution is not reached. The start condition can be set such that, for example, a suction temperature detected by a temperature sensor attached to the air conditioning apparatus 100 is 19° C. or less and a pixel average of a thermal image acquired by the detecting unit 200 is 35 or less. The end condition can be set such that, for example, a suction temperature detected by the temperature sensor attached to the air conditioning apparatus 100 is 29° C. or more and a pixel average of a thermal image acquired by the infrared sensor 201 is 150 or more.
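The example thresholds above can be written as two predicates. The threshold values come from the example in the text; the function and parameter names are hypothetical.

```python
def start_condition(suction_temp_c, pixel_average):
    """Example start condition: suction temperature <= 19 deg C and pixel average <= 35."""
    return suction_temp_c <= 19 and pixel_average <= 35


def end_condition(suction_temp_c, pixel_average):
    """Example end condition: suction temperature >= 29 deg C and pixel average >= 150."""
    return suction_temp_c >= 29 and pixel_average >= 150
```

Both parts of each condition must hold, so a low suction temperature alone does not start the process, and a high suction temperature alone does not end it.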
(4) Process of Functional Units of Server 300
Next, a process performed by the functional units of the server 300 will be described. FIG. 6 is a flowchart illustrating a procedure of a process performed by the functional units of the server 300 according to the present embodiment. Each of the steps is executed by the processor 301.
First, in step 501, the experience acquisition unit 313 of the server 300 transmits, to the air conditioning control unit 314, a temperature distribution of the target space α received from the detecting unit 200 as a state of the target space α.
In step 502, the air conditioning control unit 314 receives, from the experience acquisition unit 313, the temperature distribution as the state of the target space α, and transfers the state to the neural network unit 315.
In step 503, the neural network unit 315 uses the state of the target space α as an input, and outputs Q values for individual actions by using the parameters in the neural network database 309.
In step 504, the neural network unit 315 transmits a list of the Q values for the individual actions to the air conditioning control unit 314.
In step 505, the air conditioning control unit 314 receives the Q values from the neural network unit 315 and determines an action having the largest Q value. The action determined by the air conditioning control unit 314 is transmitted to the experience acquisition unit 313.
In step 506, the experience acquisition unit 313 receives the action from the air conditioning control unit 314 and transmits the received action as a control instruction to the air conditioning apparatus 100.
The air conditioning apparatus 100 performs an air conditioning operation on the basis of the control instruction. After that, in step 507, the experience acquisition unit 313 determines a result and a reward and merges acquired pieces of information. Specifically, the experience acquisition unit 313 merges an original state, an action performed by the air conditioning apparatus 100, a reward, and a new state (result) into one experience. The experience acquisition unit 313 transmits the merged information to the memory database 308.
In step 508, the learning unit 316 updates the weights and biases of the neural network with new weights and biases in the neural network database 309 on the basis of the experience stored in the memory database 308. Accordingly, the neural network is optimized and updated. The timing at which the learning unit 316 performs updating is not limited thereto. Updating may be performed after all the steps have finished.
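To illustrate how a stored experience can drive an update of action evaluation values, the sketch below applies standard one-step Q-learning to a lookup table. This is a deliberately simplified stand-in: it is neither the advantage learning named in the present disclosure nor an update of neural network weights and biases. The function name, the learning rate `alpha`, the discount factor `gamma`, and the sample values are all assumptions.

```python
def q_learning_update(q_table, experience, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning step for a (state, action, reward, result) experience."""
    state, action, reward, result = experience
    # Best value currently estimated for the resulting state; unseen pairs default to 0.
    best_next = max(q_table.get((result, a), 0.0) for a in actions)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q_table[(state, action)]


# Hypothetical table with one known value for action 1 in state "s1".
q_table = {("s1", 1): 2.0}
new_value = q_learning_update(q_table, ("s0", 1, 1.0, "s1"), actions=[1, 2])
```

The update moves the estimate for the taken action toward the received reward plus the discounted best estimate for the resulting state, which is the role the Q value plays in the process described above.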
In step 509, the experience acquisition unit 313 determines whether the result reaches the target temperature distribution. If it is determined in step 509 that the result does not reach the target temperature distribution (NO in 509), the process returns to step 501, and the experience acquisition process is executed again until the target temperature distribution is reached. On the other hand, if it is determined in step 509 that the result reaches the target temperature distribution (YES in 509), the process performed by the functional units of the server 300 ends.
The present invention is not limited to the above-described embodiment and includes various modifications. For example, the above-described embodiment is a detailed description given for easy understanding of the present invention. The invention is not necessarily limited to an invention including all the configurations described above. Part of a configuration of an embodiment can be replaced with a configuration of another embodiment. Also, a configuration of an embodiment can be added to a configuration of another embodiment. For part of the configuration of each embodiment, another configuration can be added, deleted, or replaced.
Some or all of the above-described configurations, functions, processing units, and so forth may be implemented by hardware by designing them with an integrated circuit, for example. The above-described configurations, functions, and so forth may be implemented by software as a result of a processor interpreting and executing a program that implements the individual functions. Information such as the program implementing the individual functions, tables, and files can be stored in a recording device, such as a memory, a hard disk, or a solid state drive (SSD), or a recording medium, such as an IC card or an SD card. Control lines and information lines that are considered to be necessary for the description are illustrated, and all control lines and information lines of a product are not necessarily illustrated. Almost all configurations may be considered as actually being connected to each other.
(5) Features
(5-1)
The air conditioning control system 1 is a system for controlling the air conditioning apparatus 100 that performs an air conditioning operation in the target space α. The air conditioning control system 1 includes the detecting unit 200 and the server 300. The detecting unit 200 is built in the air conditioning apparatus 100 or is disposed in the target space α and includes the infrared sensor 201. The detecting unit 200 detects a temperature distribution of the target space α on the basis of a thermal image detected by the infrared sensor 201. The server 300 controls the air conditioning apparatus 100 such that the temperature distribution of the target space α approaches a target temperature distribution, on the basis of the temperature distribution detected by the detecting unit 200, by using the learning device that has learned by deep reinforcement learning.
The air conditioning control system 1 updates the learning device on the basis of a temperature distribution of the target space α obtained after the server 300 has controlled the air conditioning apparatus 100. The infrared sensor 201 detects a new thermal image of the target space α after the server 300 has controlled the air conditioning apparatus 100. The detecting unit 200 detects a new temperature distribution of the target space α on the basis of the new thermal image. The server 300 controls the air conditioning apparatus 100 such that the new temperature distribution approaches the target temperature distribution, by using the updated learning device.
Accordingly, the server 300 determines, by using the learning device that learns as appropriate by deep reinforcement learning, a control instruction for causing the temperature distribution of the target space α to approach the target temperature distribution, and thus the target temperature distribution can be efficiently reached in a short time.
In addition, the server 300 is capable of updating the learning device on the basis of a temperature distribution (result) of the target space α obtained after an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction, and performing new air conditioning control by using the updated learning device. Accordingly, the target temperature distribution can be reached more efficiently in a short time.
(5-2)
The target temperature distribution set in advance in the air conditioning control system 1 is a uniformized temperature distribution or a temperature distribution including predetermined temperature variations. The learning device has learned by deep reinforcement learning on the basis of a value obtained through statistical processing performed on the temperature distribution of the target space α and the target temperature distribution.
The server 300 of the air conditioning control system 1 controls the air conditioning apparatus 100 such that the temperature distribution at a floor surface and a wall surface of the target space α approaches the target temperature distribution.
The server 300 transmits, to the air conditioning apparatus 100, a control instruction for causing the temperature distribution at the floor surface and the wall surface of the target space α to approach the target temperature distribution, and can thereby cause the temperature distribution of the target space α to efficiently approach the target temperature distribution obtained through statistical processing.
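The disclosure does not state which statistic is computed from the detected and target temperature distributions. As one possible illustration, the following sketch assumes a root-mean-square deviation between the two distributions as the learning signal and a population standard deviation as a measure of uniformity. The function names and the grid representation (a list of rows of temperatures in degrees Celsius) are assumptions.

```python
import math
from statistics import pstdev


def flatten(grid):
    """Turn a grid (list of rows) of temperatures into a flat list."""
    return [t for row in grid for t in row]


def rms_deviation(grid, target_grid):
    """Root-mean-square difference between detected and target grids."""
    diffs = [t - g for t, g in zip(flatten(grid), flatten(target_grid))]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def reward(grid, target_grid):
    """Assumed reward: larger (closer to zero) when the grids agree better."""
    return -rms_deviation(grid, target_grid)


def spread(grid):
    """Population standard deviation; zero for a perfectly uniform grid."""
    return pstdev(flatten(grid))


detected = [[22.0, 26.0], [22.0, 26.0]]
uniform_target = [[24.0, 24.0], [24.0, 24.0]]   # uniformized distribution
varied_target = [[22.0, 26.0], [22.0, 26.0]]    # predetermined variations
```

With these grids, `reward(detected, uniform_target)` is negative because every cell is 2 degrees off, whereas `reward(detected, varied_target)` is zero because the detected variations match the predetermined ones.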
(5-3)
The server 300 of the air conditioning control system 1 controls at least any of an operation mode, a set temperature, an air direction, and an air volume in the air conditioning operation performed by the air conditioning apparatus 100 to cause the temperature distribution of the target space α to approach the target temperature distribution.
The learning device of the server 300 learns to achieve the target temperature distribution with the simplest possible control instruction. The server 300 transmits a control instruction including any of the above controls to the air conditioning apparatus 100, thereby causing the temperature distribution of the target space α to approach the target temperature distribution.
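A control instruction covering the four controllable items, and one conceivable way of favoring simple instructions, can be sketched as follows. The disclosure does not describe how simplicity is learned; subtracting a penalty per changed setting from the reward is an assumption, as are the class name `ControlInstruction`, the field value strings, and the `penalty` parameter.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class ControlInstruction:
    """One instruction; a field left as None means 'leave unchanged'."""
    operation_mode: Optional[str] = None     # e.g. "cool" or "heat" (hypothetical)
    set_temperature: Optional[float] = None  # in deg C
    air_direction: Optional[str] = None      # e.g. "down" (hypothetical)
    air_volume: Optional[str] = None         # e.g. "high" (hypothetical)


def complexity(instruction):
    """Count how many settings the instruction changes."""
    return sum(getattr(instruction, f.name) is not None
               for f in fields(instruction))


def shaped_reward(base_reward, instruction, penalty=0.1):
    """Assumed shaping: subtract a small cost for every changed setting."""
    return base_reward - penalty * complexity(instruction)


simple = ControlInstruction(set_temperature=25.0)
busy = ControlInstruction(operation_mode="cool", set_temperature=25.0,
                          air_direction="down", air_volume="high")
```

Under this shaping, two instructions that bring the room equally close to the target are ranked by `complexity`, so `simple` is preferred over `busy`.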
(5-4)
An air conditioning control method is a method for controlling the air conditioning apparatus 100 that performs an air conditioning operation in the target space α. The air conditioning control method includes an acquisition step 401 of acquiring a thermal image of the target space α, a detection step 402 of detecting a temperature distribution of the target space α on the basis of the thermal image, and a control step 404 of controlling the air conditioning apparatus 100 such that the temperature distribution of the target space α approaches a target temperature distribution, on the basis of the detected temperature distribution, by using the learning device that has learned by deep reinforcement learning.
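The three steps of the method can be laid out as three functions. This is a sketch under stated assumptions: the thermal image is taken to be a list of rows of temperatures in degrees Celsius, the floor surface and wall surface are taken to occupy given rows of that image, and `placeholder_policy` is a fixed rule that merely stands in for the learning device. None of these names or representations is taken from the disclosure.

```python
def acquire_thermal_image(sensor):
    """Step 401: read one thermal frame; `sensor` is any zero-argument callable."""
    return sensor()


def detect_temperature_distribution(image, regions):
    """Step 402: mean temperature per named region.

    `regions` maps a region name to the image rows it covers (assumption).
    """
    distribution = {}
    for name, rows in regions.items():
        cells = [t for r in rows for t in image[r]]
        distribution[name] = sum(cells) / len(cells)
    return distribution


def control(distribution, target, policy):
    """Step 404: obtain a control instruction from the policy."""
    return policy(distribution, target)


def placeholder_policy(distribution, target):
    """Stand-in for the learning device: a fixed rule, not a learned one."""
    mean = sum(distribution.values()) / len(distribution)
    return {"set_temperature": target} if abs(mean - target) > 0.5 else {}


# Usage with a fake sensor returning a 3x2 frame.
frame = acquire_thermal_image(lambda: [[20.0, 20.0], [22.0, 22.0], [26.0, 26.0]])
dist = detect_temperature_distribution(frame, {"wall": [0, 1], "floor": [2]})
instruction = control(dist, 25.0, placeholder_policy)
```

Passing the policy as an argument keeps steps 401 and 402 independent of how the instruction is chosen, so a trained learner could replace the placeholder without changing the rest.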
(6)
The embodiment of the present disclosure has been described above. It is to be understood that the embodiment and the details can be variously changed without deviating from the gist and scope of the present disclosure described in the claims.

REFERENCE SIGNS LIST

- 1 air conditioning control system
- 100 air conditioning apparatus
- 200 detecting unit
- 201 infrared sensor
- 300 server
- 401 acquisition step
- 402 detection step
- 404 control step
- α target space
- α1 floor surface
- α2 wall surface

CITATION LIST

Patent Literature

<PTL 1> Japanese Unexamined Patent Application Publication No. 2012-184899

Claims

1. An air conditioning control system for controlling an air conditioning apparatus that performs an air conditioning operation in a target space, the air conditioning control system comprising:

a detecting unit configured to detect a temperature distribution of the target space; and

a server configured to control the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, based on the temperature distribution detected by the detecting unit by using a learning device that has learned by deep reinforcement learning.

2. The air conditioning control system according to claim 1, wherein

the detecting unit includes an infrared sensor arranged and configured to detect a thermal image of the target space, and

the detecting unit is configured to detect the temperature distribution of the target space based on the thermal image detected by the infrared sensor.

3. The air conditioning control system according to claim 1, wherein

the learning device is updated based on a temperature distribution of the target space obtained after the server has controlled the air conditioning apparatus,

the detecting unit includes an infrared sensor configured to detect a new thermal image of the target space after the server has controlled the air conditioning apparatus,

the detecting unit is configured to detect a new temperature distribution of the target space based on the new thermal image, and

the server is configured to control the air conditioning apparatus such that the new temperature distribution approaches the target temperature distribution, by using the updated learning device.

4. The air conditioning control system according to claim 1, wherein

the target temperature distribution is a uniformized temperature distribution or a temperature distribution including predetermined temperature variations, and

the learning device has learned by the deep reinforcement learning based on a value obtained through statistical processing performed on the temperature distribution of the target space and the target temperature distribution.

5. The air conditioning control system according to claim 1, wherein

the server is further configured to control the air conditioning apparatus such that the temperature distribution at a floor surface and a wall surface of the target space approaches the target temperature distribution.

6. The air conditioning control system according to claim 1, wherein

the detecting unit includes an infrared sensor that is built in the air conditioning apparatus or is disposed in the target space.

7. The air conditioning control system according to claim 1, wherein

the server is further configured to control at least any of an operation mode, a set temperature, an air direction, and an air volume in the air conditioning operation performed by the air conditioning apparatus to cause the temperature distribution of the target space to approach the target temperature distribution.

8. An air conditioning control method for controlling an air conditioning apparatus that performs an air conditioning operation in a target space, the air conditioning control method comprising:

acquiring a thermal image of the target space;

detecting a temperature distribution of the target space based on the thermal image; and

controlling the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, based on the temperature distribution by using a learning device that has learned by deep reinforcement learning.

9. The air conditioning control system according to claim 2, wherein

the infrared sensor is configured to detect a new thermal image of the target space after the server has controlled the air conditioning apparatus,

10. The air conditioning control system according to claim 2, wherein

11. The air conditioning control system according to claim 2, wherein

12. The air conditioning control system according to claim 2, wherein

the infrared sensor is built in the air conditioning apparatus or is disposed in the target space.

13. The air conditioning control system according to claim 2, wherein

14. The air conditioning control system according to claim 3, wherein

15. The air conditioning control system according to claim 3, wherein

16. The air conditioning control system according to claim 3, wherein

17. The air conditioning control system according to claim 3, wherein

18. The air conditioning control system according to claim 4, wherein

19. The air conditioning control system according to claim 4, wherein

20. The air conditioning control system according to claim 4, wherein