US20220178572A1 - Air conditioning control system and air conditioning control method - Google Patents


Info

Publication number
US20220178572A1
Authority
US
United States
Prior art keywords
air conditioning
temperature distribution
target space
target
control system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US17/437,393
Inventor
Shouta TANAKA
Takeshi MORINIBU
Tomohiro Noda
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Daikin Industries Ltd
Original Assignee
Daikin Industries Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Daikin Industries Ltd
Assigned to DAIKIN INDUSTRIES, LTD. reassignment DAIKIN INDUSTRIES, LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: TANAKA, SHOUTA, MORINIBU, Takeshi, NODA, TOMOHIRO

Classifications

    • F: MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F24: HEATING; RANGES; VENTILATING
    • F24F: AIR-CONDITIONING; AIR-HUMIDIFICATION; VENTILATION; USE OF AIR CURRENTS FOR SCREENING
    • F24F11/00: Control or safety arrangements
    • F24F11/50: Control or safety arrangements characterised by user interfaces or communication
    • F24F11/56: Remote control
    • F24F11/62: Control or safety arrangements characterised by the type of control or by internal processing, e.g. using fuzzy logic, adaptive control or estimation of values
    • F24F11/63: Electronic processing
    • F24F11/64: Electronic processing using pre-stored data
    • F24F11/65: Electronic processing for selecting an operating mode
    • F24F11/70: Control systems characterised by their outputs; Constructional details thereof
    • F24F11/72: Control systems characterised by their outputs; Constructional details thereof for controlling the supply of treated air, e.g. its pressure
    • F24F11/74: Control systems characterised by their outputs; Constructional details thereof for controlling the supply of treated air, e.g. its pressure for controlling air flow rate or air velocity
    • F24F11/79: Control systems characterised by their outputs; Constructional details thereof for controlling the supply of treated air, e.g. its pressure for controlling the direction of the supplied air
    • F24F11/80: Control systems characterised by their outputs; Constructional details thereof for controlling the temperature of the supplied air
    • F24F2110/00: Control inputs relating to air properties
    • F24F2110/10: Temperature

Definitions

  • An air conditioning control system for controlling an air conditioning apparatus and an air conditioning control method for the same.
  • An air conditioning control system according to a first aspect is a system for controlling an air conditioning apparatus that performs an air conditioning operation in a target space.
  • The air conditioning control system includes a detecting unit and a server.
  • The detecting unit detects a temperature distribution of the target space.
  • The server controls the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, on the basis of the temperature distribution detected by the detecting unit, by using a learning device that has learned by deep reinforcement learning.
  • The learning device learns, by deep reinforcement learning, to cause the temperature distribution of the target space to efficiently approach the target temperature distribution. Accordingly, the temperature distribution of the target space can be brought close to the target temperature distribution efficiently and in a short time.
  • An air conditioning control system according to a second aspect is the system according to the first aspect, in which the detecting unit includes an infrared sensor that detects a thermal image of the target space.
  • The detecting unit detects the temperature distribution of the target space on the basis of the thermal image detected by the infrared sensor.
  • An air conditioning control system according to a third aspect is the system according to the first aspect or the second aspect, in which the learning device is updated on the basis of a temperature distribution of the target space obtained after the server has controlled the air conditioning apparatus.
  • The infrared sensor detects a new thermal image of the target space after the server has controlled the air conditioning apparatus.
  • The detecting unit detects a new temperature distribution of the target space on the basis of the new thermal image.
  • The server controls the air conditioning apparatus such that the new temperature distribution approaches the target temperature distribution, by using the updated learning device.
  • An air conditioning control system according to a fourth aspect is the system according to any one of the first aspect to the third aspect, in which the target temperature distribution is a uniformized temperature distribution or a temperature distribution including predetermined temperature variations.
  • The learning device has learned by the deep reinforcement learning on the basis of a value obtained through statistical processing performed on the temperature distribution of the target space and the target temperature distribution.
  • An air conditioning control system according to a fifth aspect is the system according to any one of the first aspect to the fourth aspect, in which the server controls the air conditioning apparatus such that the temperature distribution at a floor surface and a wall surface of the target space approaches the target temperature distribution.
  • An air conditioning control system according to a sixth aspect is the system according to any one of the first aspect to the fifth aspect, in which the infrared sensor is built in the air conditioning apparatus or is disposed in the target space.
  • An air conditioning control system according to a seventh aspect is the system according to any one of the first aspect to the sixth aspect, in which the server controls at least one of an operation mode, a set temperature, an air direction, and an air volume in the air conditioning operation performed by the air conditioning apparatus to cause the temperature distribution of the target space to approach the target temperature distribution.
  • An air conditioning control method according to an eighth aspect is a method for controlling an air conditioning apparatus that performs an air conditioning operation in a target space.
  • the air conditioning control method includes a step of detecting a thermal image of the target space, a step of detecting a temperature distribution of the target space on the basis of the thermal image, and a step of controlling the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, on the basis of the temperature distribution by using a learning device that has learned by deep reinforcement learning.
  • FIG. 1 is a schematic diagram of an air conditioning control system.
  • FIG. 2 is a block diagram illustrating the configuration of the air conditioning control system.
  • FIG. 3 is a schematic diagram of a memory database.
  • FIG. 4 is a block diagram illustrating the configuration of functional units of the air conditioning control system.
  • FIG. 5 is a flowchart illustrating a process performed by the entire air conditioning control system.
  • FIG. 6 is a flowchart illustrating a process performed by the functional units of the air conditioning control system.
  • the air conditioning control system 1 includes an air conditioning apparatus 100 , a detecting unit 200 , and a server 300 .
  • An indoor unit of the air conditioning apparatus 100 and the detecting unit 200 are installed in a target space ⁇ in which the air conditioning apparatus 100 performs an air conditioning operation, such as an office or a residence.
  • the air conditioning operation includes cooling, heating, dehumidifying, and the like.
  • the target space ⁇ includes a wall ⁇ 1 , a floor ⁇ 2 , an obstacle ⁇ 3 , and so forth.
  • the server 300 is installed in a management room or the like outside the target space ⁇ .
  • the air conditioning apparatus 100 and the detecting unit 200 are connected to each other via a wireless or wired network N.
  • the air conditioning apparatus 100 and the server 300 are connected to each other via the wireless or wired network N.
  • the air conditioning apparatus 100 illustrated in FIG. 2 performs an air conditioning operation in the target space ⁇ .
  • the air conditioning apparatus 100 mainly includes a control unit 101 , a compressor 102 , a louver 103 , and a fan 104 .
  • The individual components of the air conditioning apparatus 100 may be disposed in either the indoor unit installed in the target space ⁇ or an outdoor unit installed outside the target space ⁇ .
  • the air conditioning apparatus 100 may include a component other than these components.
  • the control unit 101 controls an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction.
  • the control instruction is received from the server 300 via the network N.
  • the control instruction includes requests or the like for changing an operation mode, a set temperature, an air direction, an air volume, and the like.
  • On the basis of the control instruction, the control unit 101, for example, controls the output of the compressor 102 to change the set temperature, changes the direction of the louver 103 to change the air direction, or controls the output of the fan 104 to change the air volume.
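  • The way a control instruction could carry only the settings to be changed can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class names `UnitState` and `ControlInstruction`, the field names, and the default values are all hypothetical.

```python
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class UnitState:
    """Hypothetical current settings of the air conditioning apparatus."""
    mode: str = "cooling"
    set_temperature_c: float = 26.0
    louver_angle_deg: int = 0
    fan_level: int = 1


@dataclass(frozen=True)
class ControlInstruction:
    """Hypothetical request; a field left as None means 'do not change'."""
    mode: Optional[str] = None
    set_temperature_c: Optional[float] = None
    louver_angle_deg: Optional[int] = None
    fan_level: Optional[int] = None


def apply_instruction(state: UnitState, instruction: ControlInstruction) -> UnitState:
    """Return a new state with only the requested settings changed."""
    changes = {k: v for k, v in asdict(instruction).items() if v is not None}
    return replace(state, **changes)


new_state = apply_instruction(
    UnitState(), ControlInstruction(louver_angle_deg=30, fan_level=3)
)
```

  • Keeping unspecified fields as `None` lets one instruction change the air direction and air volume while leaving the operation mode and set temperature untouched.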
  • the detecting unit 200 is built in the air conditioning apparatus 100 and is attached to a front surface of the air conditioning apparatus 100 , as illustrated in FIG. 1 .
  • the detecting unit 200 may be attached to the wall ⁇ 1 , the ceiling, or the like in the target space ⁇ .
  • the detecting unit 200 detects a temperature distribution of the target space ⁇ on the basis of a thermal image.
  • the detecting unit 200 includes an infrared sensor 201 .
  • the infrared sensor 201 acquires a two-dimensional thermal image.
  • the infrared sensor 201 includes, for example, a pixel group arranged in a two-dimensional matrix, and has a structure capable of acquiring a plurality of two-dimensional thermal images at one time.
  • the infrared sensor 201 may include a pixel group (line sensor) arranged one-dimensionally and may have a structure of one-dimensionally scanning the pixel group to acquire a two-dimensional thermal image.
  • the infrared sensor 201 may include one or more pixels and may have a structure of two-dimensionally scanning the one or more pixels to acquire a two-dimensional thermal image.
  • the configuration of the infrared sensor 201 is not limited.
  • a thermal image is displayed in such a manner that a portion (pixel) having a higher temperature in the target space ⁇ has a higher density. Accordingly, it is possible to acquire a temperature distribution of the target space ⁇ and determine the temperature distribution of the target space ⁇ . Display of a thermal image is not limited thereto.
  • the temperature distribution and the target temperature distribution herein are based on a value obtained through statistical processing. Specifically, for example, it is obvious that, in a uniformized temperature distribution of the target space ⁇ , the wall ⁇ 1 , the floor ⁇ 2 , the obstacle ⁇ 3 , and so forth in the target space ⁇ do not need to have an identical temperature.
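  • One possible form of the statistical processing mentioned above is to reduce a thermal image to a mean and a spread. The sketch below assumes the thermal image is a two-dimensional grid of per-pixel temperatures; the function names and the `max_spread` threshold are hypothetical and are not taken from the source.

```python
from statistics import mean, pstdev


def summarize(thermal_image):
    """Flatten a 2-D grid of per-pixel temperatures and return (mean, spread)."""
    pixels = [t for row in thermal_image for t in row]
    return mean(pixels), pstdev(pixels)


def is_uniform(thermal_image, max_spread=1.0):
    """Treat the distribution as uniformized when the spread is small enough.

    max_spread is an assumed tolerance; individual pixels (wall, floor,
    obstacle) need not have an identical temperature.
    """
    _, spread = summarize(thermal_image)
    return spread <= max_spread


image = [[24.0, 24.5, 25.0],
         [24.0, 24.5, 25.0]]
average, spread = summarize(image)
```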
  • the server 300 holds, as a learning device, a deep neural network (DNN) which is a neural network including an input layer, a plurality of intermediate layers, and an output layer.
  • the server 300 determines a control instruction to be transmitted to the air conditioning apparatus 100 on the basis of information that is output from the output layer in response to input of input information to the input layer.
  • the server 300 transmits the control instruction to the air conditioning apparatus 100 via the network N to control the air conditioning apparatus 100 .
  • the server 300 is illustrated as a single apparatus. However, it is preferable that the server 300 be compatible with cloud computing. Thus, the hardware configuration of the server 300 need not be accommodated in a single housing or provided as a single apparatus.
  • the server 300 is configured as a result of hardware resources of the server 300 being dynamically connected or disconnected in accordance with a load.
  • the server 300 includes a processor 301 , a memory 302 , and auxiliary storage 307 . These components are connected to each other via a bus.
  • the memory 302 and the auxiliary storage 307 are examples of a storage device.
  • the processor 301 executes various calculation processes while referring to the memory 302 .
  • the memory 302 stores an experience acquisition program 303 , an air conditioning control program 304 , a neural network program 305 , and a learning program 306 .
  • the learning device included in the server 300 executes these various programs to learn and to be updated.
  • The learning device learns, by deep reinforcement learning, a control instruction for bringing the temperature distribution of the target space ⁇ to the target temperature distribution by using the simplest possible air conditioning control.
  • the target temperature distribution is a temperature distribution that is set in advance by a user who uses the air conditioning apparatus 100 or a manager who manages the server 300 .
  • the target temperature distribution is, for example, a uniformized temperature distribution or a temperature distribution including temperature variations.
  • the experience acquisition program 303 is a program for acquiring an experience that is acquired by providing a control instruction to the air conditioning apparatus 100 by the server 300 .
  • An experience is represented by, for example, a temperature distribution of the target space ⁇ acquired by the detecting unit 200 (state), a control instruction (action), a reward, and a temperature distribution of the target space ⁇ acquired by the detecting unit 200 after the air conditioning apparatus 100 has performed an air conditioning operation on the basis of the control instruction (result).
  • the experience acquired by the experience acquisition program 303 is stored in a memory database 308 .
  • the air conditioning control program 304 is a program for determining a control instruction of controlling an air conditioning operation performed by the air conditioning apparatus 100 .
  • The control instruction is defined in accordance with the ability and specifications of the air conditioning apparatus 100 or with the purpose of learning.
  • For example, when the purpose of learning is to have the learning device learn to transmit, to the air conditioning apparatus 100 , a control instruction that causes the temperature distribution of the target space ⁇ to approach a uniformized temperature distribution, the control instruction includes instructions to change an operation mode, a set temperature, an air direction, and an air volume.
  • the ranges of these control instructions are determined in accordance with the ability and specifications of the air conditioning apparatus 100 .
  • the neural network program 305 has an input which is a temperature distribution of the target space ⁇ , and an output which is a Q value (action evaluation value) of a control instruction to be transmitted to the air conditioning apparatus 100 .
  • the neural network is an evaluation model (or an evaluation function) that determines an evaluation value and the parameters thereof are updated by the learning program 306 as appropriate.
  • the air conditioning control system 1 disclosed below is a system trained by deep reinforcement learning, in which an action evaluation model is represented by a deep neural network.
  • the neural network program 305 can be edited and is customized in accordance with a system to which the program is applied. For example, in the present embodiment, the neural network program 305 first processes a temperature distribution detected by the detecting unit 200 by using convolution and pooling, and extracts features. Furthermore, the neural network program 305 couples the extracted features to a long short-term memory (LSTM), adds a time-series influence, and outputs a result as a Q value.
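  • The convolution and pooling stage described above can be illustrated with a plain-Python sketch. It covers feature extraction only; the LSTM stage and the Q-value output are omitted, and the averaging kernel, image size, and function names are assumptions chosen for illustration rather than details from the source.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete edge windows are dropped."""
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# Hypothetical 5x5 temperature grid that gets warmer from left to right.
image = [[20.0 + col for col in range(5)] for _ in range(5)]
averaging_kernel = [[0.25, 0.25], [0.25, 0.25]]
features = max_pool(conv2d(image, averaging_kernel))
```

  • In a trained network the kernel weights would be learned rather than fixed, and the pooled features from successive thermal images would be passed to the LSTM.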
  • the learning program 306 updates and optimizes the parameters of the neural network.
  • The learning program 306 optimizes the parameters of the neural network by advantage learning, for example. Accordingly, the neural network becomes capable of more accurately estimating Q values of individual actions in a given state, and the server 300 is capable of determining a more intelligent control instruction.
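  • The idea of improving Q-value estimates can be illustrated with the standard one-step Q-learning target. Note that this is a simpler stand-in, not the advantage learning named in the source, and the discount factor `gamma` and the function names are assumptions.

```python
def td_target(reward, next_q_values, gamma=0.9, terminal=False):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)


def td_error(q_values, action, reward, next_q_values, gamma=0.9, terminal=False):
    """Difference between the target and the current estimate Q(s, a).

    A learning step would adjust the network parameters to shrink this error.
    """
    target = td_target(reward, next_q_values, gamma, terminal)
    return target - q_values[action]


error = td_error(q_values=[0.5, 1.0], action=1, reward=1.0,
                 next_q_values=[0.0, 2.0], gamma=0.5)
```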
  • the auxiliary storage 307 stores the memory database 308 and a neural network database 309 .
  • FIG. 3 is a schematic diagram of the memory database 308 according to the present embodiment.
  • The memory database 308 has a limited capacity. The capacity is determined in advance by an engineer or the like. When the memory database 308 becomes full, the oldest experience in the memory database 308 is deleted to make room for a new experience.
  • the memory database 308 has columns of an index 318 , a state 328 , an action 338 , a reward 348 , and a result 358 .
  • the memory database 308 may have any structure as long as information on experiences can be stored therein.
  • the index 318 indicates integers, which indicate the order of experiences stored in the memory database 308 .
  • the index 318 indicates which of the experiences stored in the memory database 308 is the oldest and is to be deleted when the memory database 308 is full and a new experience is to be stored.
  • the state 328 includes information about the target space ⁇ , that is, information on a temperature distribution acquired from a thermal image.
  • the state 328 may include, for example, values related to the target space ⁇ acquired by various sensors of the air conditioning apparatus 100 .
  • The action 338 indicates positive numerical values. Each number is an action ID that identifies one specific action, that is, one control instruction transmittable to the air conditioning apparatus 100 by the server 300 .
  • the reward 348 indicates numerical values each defining a reward that can be acquired after the state of the target space ⁇ has been changed in response to an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction transmitted by the server 300 .
  • For example, when a control instruction to change the direction of the louver 103 is provided to the air conditioning apparatus 100 and the resulting temperature distribution moves away from the target temperature distribution, the acquired reward is a negative value.
  • Conversely, when the resulting temperature distribution moves closer to the target temperature distribution, the acquired reward is a positive value.
  • Rewards for individual actions in individual states are set in advance.
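  • The sign convention for rewards can be sketched as below. The source only states that rewards are set in advance, so the distance measure (mean absolute per-pixel difference), the reward magnitudes of +1 and -1, and the function names are assumptions made for illustration.

```python
def distance(distribution, target):
    """Mean absolute per-pixel difference between two equally sized grids."""
    diffs = [
        abs(a - b)
        for row_a, row_b in zip(distribution, target)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)


def reward(before, after, target):
    """Positive if the action moved the distribution toward the target,
    negative if it moved away, zero if the distance is unchanged."""
    d_before = distance(before, target)
    d_after = distance(after, target)
    if d_after < d_before:
        return 1.0
    if d_after > d_before:
        return -1.0
    return 0.0


target = [[24.0, 24.0]]
r = reward(before=[[28.0, 26.0]], after=[[25.0, 25.0]], target=target)
```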
  • the result 358 is a transition state after an action (an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction) has been made in a state. Regarding this result, it is defined whether an executed control instruction is able to acquire a reward.
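  • A bounded store with the five columns described above can be sketched with a fixed-length queue. The class and method names are hypothetical; the only behavior taken from the source is that the capacity is fixed in advance and the oldest experience is dropped when a new one arrives at a full database.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Experience:
    index: int      # storage order; the smallest index is the oldest row
    state: tuple    # temperature distribution before the action
    action: int     # action ID of the control instruction
    reward: float
    result: tuple   # temperature distribution after the action


class MemoryDatabase:
    """Fixed-capacity experience store that evicts the oldest row when full."""

    def __init__(self, capacity):
        self._rows = deque(maxlen=capacity)
        self._next_index = count()

    def store(self, state, action, reward, result):
        row = Experience(next(self._next_index), state, action, reward, result)
        self._rows.append(row)  # deque(maxlen=...) discards the oldest row
        return row

    def oldest(self):
        return self._rows[0]

    def __len__(self):
        return len(self._rows)


db = MemoryDatabase(capacity=2)
for step in range(3):
    db.store(state=(26.0,), action=1, reward=0.0, result=(25.5,))
```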
  • the neural network database 309 includes weights and biases of links between nodes in the neural network.
  • a node transmits information to another node by using a weight and a bias.
  • the neural network database 309 is updated so that the neural network is capable of more accurately estimating Q values for individual actions.
  • FIG. 4 is a block diagram illustrating functional units of the server 300 according to Embodiment 1.
  • An experience acquisition unit 313 is implemented by execution of the experience acquisition program 303 by the processor 301 .
  • the experience acquisition unit 313 learns how the environment of the target space ⁇ is changed as a result of an air conditioning operation performed by the air conditioning apparatus 100 in response to a control instruction transmitted by the server 300 .
  • a state, an action, a reward, and a result are merged into an experience, which is transmitted to the memory database 308 .
  • the experience acquisition unit 313 receives a control instruction to be transmitted to the air conditioning apparatus 100 from an air conditioning control unit 314 so as to cause the state of the target space ⁇ to approach a target temperature distribution.
  • the air conditioning control unit 314 determines a control instruction to be transmitted to the air conditioning apparatus 100 .
  • the air conditioning control unit 314 is implemented by execution of the air conditioning control program 304 by the processor 301 .
  • the air conditioning control unit 314 receives, from the experience acquisition unit 313 , a temperature distribution of the target space ⁇ acquired by the detecting unit 200 as a state, transmits the state to a neural network unit 315 , and acquires Q values for individual actions that can be made.
  • the air conditioning control unit 314 may or may not use Q value information to determine an action.
  • The air conditioning control unit 314 has a parameter called epsilon (ε) and determines, on the basis of the parameter, whether to use a Q value or search for a random action (ε-greedy method).
  • The parameter ε is set to a fixed value by a developer in advance, or is decreased from 1 to 0 in proportion to the training time.
  • The air conditioning control unit 314 randomly selects a number, compares the number with the ε value, and determines whether to use a Q value or to search for a random action.
  • the air conditioning control unit 314 transmits the determined action to the experience acquisition unit 313 .
  • the method of determining an action is not limited thereto.
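  • The ε-greedy selection described above can be sketched as follows. The linear decay schedule is one reading of "decreased from 1 to 0 in proportion to the training time"; the function names and the `total_steps` parameter are assumptions.

```python
import random


def epsilon_at(step, total_steps):
    """Linearly decay epsilon from 1 at step 0 to 0 at total_steps."""
    return max(0.0, 1.0 - step / total_steps)


def choose_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit


greedy_action = choose_action([0.1, 0.9, 0.3], epsilon=0.0)
```

  • With ε = 0 the choice is always the action with the largest Q value, and with ε = 1 it is always random.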
  • a learning unit 316 optimizes neural network parameters so that the neural network is capable of more accurately estimating Q values for individual actions when an input is a current state of the target space ⁇ .
  • the learning unit 316 is implemented by execution of the learning program 306 by the processor 301 .
  • FIG. 5 is a flowchart illustrating a procedure of a process performed by the entire air conditioning control system 1 according to the present embodiment.
  • In step 401 , the infrared sensor 201 acquires a thermal image of the target space ⁇ .
  • In step 402 , the detecting unit 200 detects a temperature distribution of the target space ⁇ on the basis of the thermal image acquired by the infrared sensor 201 .
  • the detected temperature distribution is transmitted to the server 300 via the network N.
  • In step 403 , the server 300 determines a control instruction to be transmitted to the air conditioning apparatus 100 on the basis of the received temperature distribution.
  • The server 300 determines a control instruction capable of achieving the target temperature distribution with the simplest possible control by using the learning device.
  • the server 300 transmits the determined control instruction to the air conditioning apparatus 100 .
  • In step 404 , the control unit 101 of the air conditioning apparatus 100 controls the individual devices of the air conditioning apparatus 100 so as to perform an air conditioning operation on the basis of the received control instruction.
  • In step 405 , the detecting unit 200 detects a temperature distribution of the target space ⁇ in a state changed by the air conditioning operation performed by the air conditioning apparatus 100 .
  • the detected temperature distribution is transmitted to the server 300 .
  • In step 406 , the learning device of the server 300 is updated.
  • In step 407 , the server 300 determines, on the basis of the temperature distribution acquired in step 405 , whether the temperature distribution of the target space ⁇ has reached the target temperature distribution.
  • If it is determined in step 407 that the target temperature distribution is not reached (NO in 407 ), the process returns to step 403 , and the individual steps are executed again until the target temperature distribution is reached. On the other hand, if it is determined in step 407 that the target temperature distribution is reached (YES in 407 ), the process ends. In this way, the process of the air conditioning control system 1 is completed.
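  • The loop of steps 401 to 407 can be sketched with the individual stages passed in as callables. The function and parameter names are hypothetical, and the `max_iterations` guard is an added assumption; the source simply repeats until the target temperature distribution is reached.

```python
def control_loop(detect, decide, actuate, update, reached_target,
                 max_iterations=100):
    """Run the detect/decide/actuate/update cycle until the target is reached.

    Returns the number of control cycles used, or None if the assumed
    iteration limit is hit first.
    """
    distribution = detect()                      # steps 401-402
    for iteration in range(1, max_iterations + 1):
        instruction = decide(distribution)       # step 403
        actuate(instruction)                     # step 404
        new_distribution = detect()              # step 405
        update(distribution, instruction, new_distribution)  # step 406
        distribution = new_distribution
        if reached_target(distribution):         # step 407
            return iteration
    return None


# Toy room: a single temperature that each instruction lowers by one degree.
room = {"temperature": 28.0}


def lower_temperature(delta):
    room["temperature"] += delta


cycles = control_loop(
    detect=lambda: room["temperature"],
    decide=lambda distribution: -1.0,
    actuate=lower_temperature,
    update=lambda state, action, result: None,
    reached_target=lambda distribution: distribution <= 25.0,
)
```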
  • a start condition for starting the process and an end condition for ending the process may be defined.
  • The process is started when the temperature distribution or the state of the target space ⁇ acquired by the various sensors of the air conditioning apparatus 100 satisfies the start condition.
  • When the end condition is satisfied, the process ends even if the target temperature distribution is not reached.
  • The start condition can be set such that, for example, a suction temperature detected by a temperature sensor attached to the air conditioning apparatus 100 is 19° C. or less and a pixel average of a thermal image acquired by the detecting unit 200 is 35 or less.
  • The end condition can be set such that, for example, a suction temperature detected by the temperature sensor attached to the air conditioning apparatus 100 is 29° C. or more and a pixel average of a thermal image acquired by the infrared sensor 201 is 150 or more.
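  • The example start and end conditions can be written as two predicates. The thresholds are the example values given above; the function and parameter names are hypothetical.

```python
def start_condition(suction_temp_c, pixel_average):
    """Example start condition: suction temperature of 19 deg C or less
    and a thermal-image pixel average of 35 or less."""
    return suction_temp_c <= 19.0 and pixel_average <= 35


def end_condition(suction_temp_c, pixel_average):
    """Example end condition: suction temperature of 29 deg C or more
    and a thermal-image pixel average of 150 or more."""
    return suction_temp_c >= 29.0 and pixel_average >= 150


should_start = start_condition(suction_temp_c=18.5, pixel_average=30)
should_end = end_condition(suction_temp_c=18.5, pixel_average=30)
```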
  • FIG. 6 is a flowchart illustrating a procedure of a process performed by the functional units of the server 300 according to the present embodiment. Each of the steps is executed by the processor 301 .
  • In step 501 , the experience acquisition unit 313 of the server 300 transmits, to the air conditioning control unit 314 , a temperature distribution of the target space ⁇ received from the detecting unit 200 as a state of the target space ⁇ .
  • In step 502 , the air conditioning control unit 314 receives, from the experience acquisition unit 313 , the temperature distribution as the state of the target space ⁇ , and transfers the state to the neural network unit 315 .
  • In step 503 , the neural network unit 315 uses the state of the target space ⁇ as an input, and outputs Q values for individual actions by using the parameters in the neural network database 309 .
  • In step 504 , the neural network unit 315 transmits a list of the Q values for the individual actions to the air conditioning control unit 314 .
  • In step 505 , the air conditioning control unit 314 receives the Q values from the neural network unit 315 and determines an action having the largest Q value. The action determined by the air conditioning control unit 314 is transmitted to the experience acquisition unit 313 .
  • In step 506 , the experience acquisition unit 313 receives the action from the air conditioning control unit 314 and transmits the received action as a control instruction to the air conditioning apparatus 100 .
  • the air conditioning apparatus 100 performs an air conditioning operation on the basis of the control instruction. After that, in step 507 , the experience acquisition unit 313 determines a result and a reward and merges acquired pieces of information. Specifically, the experience acquisition unit 313 merges an original state, an action performed by the air conditioning apparatus 100 , a reward, and a new state (result) into one experience. The experience acquisition unit 313 transmits the merged information to the memory database 308 .
  • In step 508 , the learning unit 316 replaces the weights and biases of the neural network in the neural network database 309 with new weights and biases on the basis of the experience stored in the memory database 308 . Accordingly, the neural network is optimized and updated.
  • the timing at which the learning unit 316 performs updating is not limited thereto. Updating may be performed after all the steps have finished.
  • In step 509 , the experience acquisition unit 313 determines whether the result reaches the target temperature distribution. If it is determined in step 509 that the result does not reach the target temperature distribution (NO in 509 ), the process returns to step 501 , and the experience acquisition process is executed again until the target temperature distribution is reached. On the other hand, if it is determined in step 509 that the result reaches the target temperature distribution (YES in 509 ), the process performed by the functional units of the server 300 ends.
  • The present invention is not limited to the above-described embodiment and includes various modifications.
  • The above-described embodiment is a detailed description given for easy understanding of the present invention.
  • The invention is not necessarily limited to an invention including all the configurations described above.
  • Part of a configuration of an embodiment can be replaced with a configuration of another embodiment.
  • A configuration of an embodiment can be added to a configuration of another embodiment.
  • Another configuration can be added, deleted, or replaced.
  • Some or all of the above-described configurations, functions, processing units, and so forth may be implemented by hardware by designing them with an integrated circuit, for example.
  • The above-described configurations, functions, and so forth may be implemented by software as a result of a processor interpreting and executing a program that implements the individual functions.
  • Information such as the program implementing the individual functions, tables, and files can be stored in a recording device, such as a memory, a hard disk, or a solid state drive (SSD), or a recording medium, such as an IC card or an SD card.
  • Control lines and information lines that are considered to be necessary for the description are illustrated, and not all control lines and information lines of a product are necessarily illustrated. Almost all configurations may be considered as actually being connected to each other.
  • The air conditioning control system 1 is a system for controlling the air conditioning apparatus 100 that performs an air conditioning operation in the target space α.
  • The air conditioning control system 1 includes the detecting unit 200 and the server 300.
  • The detecting unit 200 is built in the air conditioning apparatus 100 or is disposed in the target space α and includes the infrared sensor 201.
  • The detecting unit 200 detects a temperature distribution of the target space α on the basis of a thermal image detected by the infrared sensor 201.
  • The server 300 controls the air conditioning apparatus 100 such that the temperature distribution of the target space α approaches a target temperature distribution, on the basis of the temperature distribution detected by the detecting unit 200, by using the learning device that has learned by deep reinforcement learning.
  • The air conditioning control system 1 updates the learning device on the basis of a temperature distribution of the target space α obtained after the server 300 has controlled the air conditioning apparatus 100.
  • The infrared sensor 201 detects a new thermal image of the target space α after the server 300 has controlled the air conditioning apparatus 100.
  • The detecting unit 200 detects a new temperature distribution of the target space α on the basis of the new thermal image.
  • The server 300 controls the air conditioning apparatus 100 such that the new temperature distribution approaches the target temperature distribution, by using the updated learning device.
  • The server 300 determines, by using the learning device that learns as appropriate by deep reinforcement learning, a control instruction for causing the temperature distribution of the target space α to approach the target temperature distribution, and thus the target temperature distribution can be efficiently reached in a short time.
  • The server 300 is capable of updating the learning device on the basis of a temperature distribution (result) of the target space α obtained after an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction, and performing new air conditioning control by using the updated learning device. Accordingly, the target temperature distribution can be reached more efficiently in a short time.
  • The target temperature distribution set in advance in the air conditioning control system 1 is a uniformized temperature distribution or a temperature distribution including predetermined temperature variations.
  • The learning device has learned by deep reinforcement learning on the basis of a value obtained through statistical processing performed on the temperature distribution of the target space α and the target temperature distribution.
  • The server 300 of the air conditioning control system 1 controls the air conditioning apparatus 100 such that the temperature distribution at a floor surface and a wall surface of the target space α approaches the target temperature distribution.
  • The server 300 transmits, to the air conditioning apparatus 100, a control instruction for causing the temperature distribution at the floor surface and the wall surface of the target space α to approach the target temperature distribution, thereby being capable of causing the temperature distribution of the target space α to efficiently approach the target temperature distribution obtained through statistical processing.
  • The server 300 of the air conditioning control system 1 controls at least any of an operation mode, a set temperature, an air direction, and an air volume in the air conditioning operation performed by the air conditioning apparatus 100 to cause the temperature distribution of the target space α to approach the target temperature distribution.
  • The learning device of the server 300 learns to achieve the target temperature distribution with a simplest possible control instruction.
  • The server 300 transmits a control instruction including any of the above controls to the air conditioning apparatus 100, thereby causing the temperature distribution of the target space α to approach the target temperature distribution.
  • An air conditioning control method is a method for controlling the air conditioning apparatus 100 that performs an air conditioning operation in the target space α.
  • The air conditioning control method includes an acquisition step 401 of acquiring a thermal image of the target space α, a detection step 402 of detecting a temperature distribution of the target space α on the basis of the thermal image, and a control step 404 of controlling the air conditioning apparatus 100 such that the temperature distribution of the target space α approaches a target temperature distribution, on the basis of the temperature distribution by using the learning device that has learned by deep reinforcement learning.

Landscapes

  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Combustion & Propulsion (AREA)
  • Mechanical Engineering (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Fluid Mechanics (AREA)
  • Air Conditioning Control Device (AREA)

Abstract

An air conditioning control system controls an air conditioning apparatus that performs an air conditioning operation in a target space. The air conditioning control system includes a detecting unit that detects a temperature distribution of the target space, and a server. The server controls the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, based on the temperature distribution detected by the detecting unit by using a learning device that has learned by deep reinforcement learning.

Description

    TECHNICAL FIELD
  • An air conditioning control system for controlling an air conditioning apparatus and an air conditioning control method for the same.
  • BACKGROUND ART
  • According to PTL 1 (Japanese Unexamined Patent Application Publication No. 2012-184899), air conditioning improvement is performed by an air conditioning operation based on a so-called rule base to approach a target temperature distribution.
  • SUMMARY OF THE INVENTION
  • Technical Problem
  • In an air conditioning operation based on a rule base, a long time is taken to approach a target temperature distribution.
  • Solution to Problem
  • An air conditioning control system according to a first aspect is a system for controlling an air conditioning apparatus that performs an air conditioning operation in a target space. The air conditioning control system includes a detecting unit and a server. The detecting unit detects a temperature distribution of the target space. The server controls the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, on the basis of the temperature distribution detected by the detecting unit, by using a learning device that has learned by deep reinforcement learning.
  • The learning device performs learning, by deep reinforcement learning, to cause the temperature distribution of the target space to efficiently approach the target temperature distribution. Accordingly, the temperature distribution of the target space can be caused to efficiently approach the target temperature distribution in a short time.
  • An air conditioning control system according to a second aspect is the system according to the first aspect, in which the detecting unit includes an infrared sensor that detects a thermal image of the target space. The detecting unit detects the temperature distribution of the target space on the basis of the thermal image detected by the infrared sensor.
  • An air conditioning control system according to a third aspect is the system according to the first aspect or the second aspect, in which the learning device is updated on the basis of a temperature distribution of the target space obtained after the server has controlled the air conditioning apparatus. The infrared sensor detects a new thermal image of the target space after the server has controlled the air conditioning apparatus. The detecting unit detects a new temperature distribution of the target space on the basis of the new thermal image. The server controls the air conditioning apparatus such that the new temperature distribution approaches the target temperature distribution, by using the updated learning device.
  • An air conditioning control system according to a fourth aspect is the system according to any one of the first aspect to the third aspect, in which the target temperature distribution is a uniformized temperature distribution or a temperature distribution including predetermined temperature variations. The learning device has learned by the deep reinforcement learning on the basis of a value obtained through statistical processing performed on the temperature distribution of the target space and the target temperature distribution.
  • An air conditioning control system according to a fifth aspect is the system according to any one of the first aspect to the fourth aspect, in which the server controls the air conditioning apparatus such that the temperature distribution at a floor surface and a wall surface of the target space approaches the target temperature distribution.
  • An air conditioning control system according to a sixth aspect is the system according to any one of the first aspect to the fifth aspect, in which the infrared sensor is built in the air conditioning apparatus or is disposed in the target space.
  • An air conditioning control system according to a seventh aspect is the system according to any one of the first aspect to the sixth aspect, in which the server controls at least any of an operation mode, a set temperature, an air direction, and an air volume in the air conditioning operation performed by the air conditioning apparatus to cause the temperature distribution of the target space to approach the target temperature distribution.
  • An air conditioning control method according to an eighth aspect is a method for controlling an air conditioning apparatus that performs an air conditioning operation in a target space. The air conditioning control method includes a step of detecting a thermal image of the target space, a step of detecting a temperature distribution of the target space on the basis of the thermal image, and a step of controlling the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, on the basis of the temperature distribution by using a learning device that has learned by deep reinforcement learning.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a schematic diagram of an air conditioning control system.
  • FIG. 2 is a block diagram illustrating the configuration of the air conditioning control system.
  • FIG. 3 is a schematic diagram of a memory database.
  • FIG. 4 is a block diagram illustrating the configuration of functional units of the air conditioning control system.
  • FIG. 5 is a flowchart illustrating a process performed by the entire air conditioning control system.
  • FIG. 6 is a flowchart illustrating a process performed by the functional units of the air conditioning control system.
  • DESCRIPTION OF EMBODIMENTS
  • Hereinafter, an air conditioning control system 1 according to an embodiment of the present disclosure will be described. The following embodiment is a specific example, does not limit the technical scope, and can be appropriately changed without deviating from the gist.
  • (1) Overall Configuration
  • As illustrated in FIG. 1, the air conditioning control system 1 includes an air conditioning apparatus 100, a detecting unit 200, and a server 300.
  • An indoor unit of the air conditioning apparatus 100 and the detecting unit 200 are installed in a target space α in which the air conditioning apparatus 100 performs an air conditioning operation, such as an office or a residence. The air conditioning operation includes cooling, heating, dehumidifying, and the like. The target space α includes a wall α1, a floor α2, an obstacle α3, and so forth. The server 300 is installed in a management room or the like outside the target space α.
  • The air conditioning apparatus 100 and the detecting unit 200 are connected to each other via a wireless or wired network N. The air conditioning apparatus 100 and the server 300 are connected to each other via the wireless or wired network N.
  • (2) Detailed Configuration
  • (2-1) Air Conditioning Apparatus 100
  • The air conditioning apparatus 100 illustrated in FIG. 2 performs an air conditioning operation in the target space α. The air conditioning apparatus 100 mainly includes a control unit 101, a compressor 102, a louver 103, and a fan 104. The individual components of the air conditioning apparatus 100 may be disposed in either of the indoor unit installed in the target space α and an outdoor unit installed outside the target space α. The air conditioning apparatus 100 may include a component other than these components.
  • The control unit 101 controls an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction. The control instruction is received from the server 300 via the network N. The control instruction includes requests or the like for changing an operation mode, a set temperature, an air direction, an air volume, and the like. The control unit 101 performs control of, for example, controlling an output or the like of the compressor 102 to change a set temperature, changing the direction of the louver 103 to change an air direction, controlling an output of the fan 104 to change an air volume, or the like, on the basis of the control instruction.
  • (2-2) Detecting Unit 200
  • In the present embodiment, the detecting unit 200 is built in the air conditioning apparatus 100 and is attached to a front surface of the air conditioning apparatus 100, as illustrated in FIG. 1. Alternatively, the detecting unit 200 may be attached to the wall α1, the ceiling, or the like in the target space α.
  • The detecting unit 200 detects a temperature distribution of the target space α on the basis of a thermal image. The detecting unit 200 includes an infrared sensor 201. The infrared sensor 201 acquires a two-dimensional thermal image. The infrared sensor 201 includes, for example, a pixel group arranged in a two-dimensional matrix, and has a structure capable of acquiring a plurality of two-dimensional thermal images at one time. Alternatively, for example, the infrared sensor 201 may include a pixel group (line sensor) arranged one-dimensionally and may have a structure of one-dimensionally scanning the pixel group to acquire a two-dimensional thermal image. Alternatively, the infrared sensor 201 may include one or more pixels and may have a structure of two-dimensionally scanning the one or more pixels to acquire a two-dimensional thermal image. Here, the configuration of the infrared sensor 201 is not limited.
  • A thermal image is displayed in such a manner that a portion (pixel) having a higher temperature in the target space α has a higher density. Accordingly, it is possible to acquire a temperature distribution of the target space α and determine the temperature distribution of the target space α. Display of a thermal image is not limited thereto.
  • The temperature distribution and the target temperature distribution herein are based on a value obtained through statistical processing. For example, in a uniformized temperature distribution of the target space α, the wall α1, the floor α2, the obstacle α3, and so forth in the target space α do not need to have an identical temperature.
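  • As an illustration of what a value obtained through statistical processing could look like, the following minimal sketch reduces a thermal image to its mean and spread. The choice of statistics (mean and population standard deviation), the function names `summarize` and `is_uniform`, and the `max_spread` threshold are assumptions made for this example; the text does not specify which statistics are used.

```python
from statistics import fmean, pstdev


def summarize(thermal_image):
    """Return (mean, population standard deviation) over all pixels.

    thermal_image is a 2-D list of per-pixel temperature values.
    """
    pixels = [value for row in thermal_image for value in row]
    return fmean(pixels), pstdev(pixels)


def is_uniform(thermal_image, max_spread):
    """Treat the distribution as uniformized when the spread is small enough.

    max_spread is a hypothetical tolerance; individual surfaces may still
    differ slightly in temperature.
    """
    _, spread = summarize(thermal_image)
    return spread <= max_spread
```

  • Under this reading, a room whose wall, floor, and obstacles differ by a fraction of a degree would still count as uniformized, because only the aggregate spread is compared with the tolerance.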
  • (2-3) Server 300
  • The server 300 holds, as a learning device, a deep neural network (DNN) which is a neural network including an input layer, a plurality of intermediate layers, and an output layer. The server 300 determines a control instruction to be transmitted to the air conditioning apparatus 100 on the basis of information that is output from the output layer in response to input of input information to the input layer. The server 300 transmits the control instruction to the air conditioning apparatus 100 via the network N to control the air conditioning apparatus 100.
  • In FIG. 1 and FIG. 2, the server 300 is illustrated as a single apparatus. However, it is preferable that the server 300 be compatible with cloud computing. Thus, the hardware configuration of the server 300 need not be accommodated in a single housing or provided as a single apparatus. For example, the server 300 is configured as a result of hardware resources of the server 300 being dynamically connected or disconnected in accordance with a load.
  • The server 300 includes a processor 301, a memory 302, and auxiliary storage 307. These components are connected to each other via a bus. The memory 302 and the auxiliary storage 307 are examples of a storage device.
  • The processor 301 executes various calculation processes while referring to the memory 302. The memory 302 stores an experience acquisition program 303, an air conditioning control program 304, a neural network program 305, and a learning program 306. The learning device included in the server 300 executes these various programs to learn and to be updated. The learning device according to the present embodiment learns, by deep reinforcement learning, a control instruction for making a temperature distribution of the target space α a target temperature distribution by using simplest possible air conditioning control. The target temperature distribution is a temperature distribution that is set in advance by a user who uses the air conditioning apparatus 100 or a manager who manages the server 300. The target temperature distribution is, for example, a uniformized temperature distribution or a temperature distribution including temperature variations.
  • The experience acquisition program 303 is a program for acquiring an experience that is obtained when the server 300 provides a control instruction to the air conditioning apparatus 100. An experience is represented by, for example, a temperature distribution of the target space α acquired by the detecting unit 200 (state), a control instruction (action), a reward, and a temperature distribution of the target space α acquired by the detecting unit 200 after the air conditioning apparatus 100 has performed an air conditioning operation on the basis of a control instruction (result). The experience acquired by the experience acquisition program 303 is stored in a memory database 308.
  • The air conditioning control program 304 is a program for determining a control instruction for controlling an air conditioning operation performed by the air conditioning apparatus 100. The control instruction is defined in accordance with the ability and specifications of the air conditioning apparatus 100 or the purpose of learning. For example, when a purpose of learning is causing the learning device to learn to transmit, to the air conditioning apparatus 100, a control instruction to cause a temperature distribution of the target space α to approach a uniformized temperature distribution, the control instruction includes instructions to change an operation mode, a set temperature, an air direction, and an air volume. The ranges of these control instructions are determined in accordance with the ability and specifications of the air conditioning apparatus 100.
  • In the present embodiment, the neural network program 305 has an input which is a temperature distribution of the target space α, and an output which is a Q value (action evaluation value) of a control instruction to be transmitted to the air conditioning apparatus 100. The neural network is an evaluation model (or an evaluation function) that determines an evaluation value and the parameters thereof are updated by the learning program 306 as appropriate. The air conditioning control system 1 disclosed below is a system trained by deep reinforcement learning, in which an action evaluation model is represented by a deep neural network. The neural network program 305 can be edited and is customized in accordance with a system to which the program is applied. For example, in the present embodiment, the neural network program 305 first processes a temperature distribution detected by the detecting unit 200 by using convolution and pooling, and extracts features. Furthermore, the neural network program 305 couples the extracted features to a long short-term memory (LSTM), adds a time-series influence, and outputs a result as a Q value.
  • The learning program 306 updates and optimizes the parameters of the neural network. The learning program 306 optimizes the parameters of the neural network by advantage learning, for example. Accordingly, the neural network becomes capable of more accurately estimating Q values of individual actions in a given state, and the server 300 is capable of determining a more intelligent control instruction.
  • The auxiliary storage 307 stores the memory database 308 and a neural network database 309.
  • FIG. 3 is a schematic diagram of the memory database 308 according to the present embodiment. The memory database 308 has a limited capacity. The capacity is determined in advance by an engineer or the like. When the memory database 308 becomes full, the first experience in the memory database 308 is deleted and a vacant space for a new experience is formed. The memory database 308 has columns of an index 318, a state 328, an action 338, a reward 348, and a result 358. The memory database 308 may have any structure as long as information on experiences can be stored therein.
  • The index 318 indicates integers, which indicate the order of experiences stored in the memory database 308. The index 318 indicates which of the experiences stored in the memory database 308 is the oldest and is to be deleted when the memory database 308 is full and a new experience is to be stored.
  • The state 328 includes information about the target space α, that is, information on a temperature distribution acquired from a thermal image. The state 328 may include, for example, values related to the target space α acquired by various sensors of the air conditioning apparatus 100.
  • The action 338 indicates positive numerical values. Each number is an action ID that identifies one specific action, that is, one control instruction that the server 300 can transmit to the air conditioning apparatus 100.
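  • The relationship between action IDs and control instructions can be sketched as a lookup table. The `ControlInstruction` fields follow the four controllable items named in the description (operation mode, set temperature, air direction, air volume), but the class, the function `build_action_table`, and the example value ranges are hypothetical; in practice the ranges would depend on the ability and specifications of the air conditioning apparatus.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ControlInstruction:
    operation_mode: str
    set_temperature_c: int
    air_direction: str
    air_volume: str


def build_action_table(modes, temperatures, directions, volumes):
    """Map positive integer action IDs to concrete control instructions."""
    combos = product(modes, temperatures, directions, volumes)
    return {
        action_id: ControlInstruction(*combo)
        for action_id, combo in enumerate(combos, start=1)
    }


# Example ranges are placeholders chosen only for illustration.
ACTIONS = build_action_table(
    modes=("cooling", "heating"),
    temperatures=(24, 26),
    directions=("up", "down"),
    volumes=("low", "high"),
)
```

  • With such a table, the neural network only needs to output one Q value per action ID, and the server can translate the selected ID back into a full control instruction before transmission.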
  • The reward 348 indicates numerical values each defining a reward that can be acquired after the state of the target space α has been changed in response to an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction transmitted by the server 300. For example, when a control instruction to change the direction of the louver 103 is provided to the air conditioning apparatus 100 and the resulting temperature distribution moves away from the target temperature distribution, the acquired reward is a negative value. On the other hand, when the foregoing instruction is provided and the resulting temperature distribution approaches the target temperature distribution, the acquired reward is a positive value. Rewards for individual actions in individual states are set in advance.
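  • The sign convention for rewards can be sketched as follows. This example computes the reward from the change in distance to the target, which is one possible way to realize the convention and is an assumption of this sketch; the description states only that rewards are set in advance. The distance measure (mean absolute difference) and the reward magnitudes of 1.0 are likewise hypothetical.

```python
def distance(distribution, target):
    """Mean absolute difference between two equally sized 2-D grids."""
    diffs = [
        abs(a - b)
        for row, target_row in zip(distribution, target)
        for a, b in zip(row, target_row)
    ]
    return sum(diffs) / len(diffs)


def reward(before, after, target):
    """Positive if the action moved the room toward the target, negative if away."""
    change = distance(before, target) - distance(after, target)
    if change > 0:
        return 1.0
    if change < 0:
        return -1.0
    return 0.0
```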
  • The result 358 is the state reached after an action (an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction) has been taken in a given state. For each result, it is defined whether the executed control instruction acquires a reward.
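  • The behavior of the memory database described above (fixed capacity, oldest experience deleted first, and columns for index, state, action, reward, and result) can be sketched with a bounded queue. The class name `MemoryDatabase`, its methods, and the use of `collections.deque` are assumptions for illustration; the description allows any structure that can store experiences.

```python
from collections import deque, namedtuple

# One row per experience, mirroring the columns of the memory database.
Experience = namedtuple(
    "Experience", ["index", "state", "action", "reward", "result"]
)


class MemoryDatabase:
    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest row when full.
        self._rows = deque(maxlen=capacity)
        self._next_index = 0

    def store(self, state, action, reward, result):
        """Merge the four pieces of information into one experience and store it."""
        row = Experience(self._next_index, state, action, reward, result)
        self._next_index += 1
        self._rows.append(row)
        return row

    def oldest(self):
        """Return the experience that would be deleted next, or None if empty."""
        return self._rows[0] if self._rows else None

    def __len__(self):
        return len(self._rows)
```

  • Because the index keeps increasing while the queue length stays bounded, the smallest index still present always marks the oldest experience.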
  • The neural network database 309 includes weights and biases of links between nodes in the neural network. A node transmits information to another node by using a weight and a bias. As a result of optimizing the weights and biases by advantage learning, the neural network database 309 is updated so that the neural network is capable of more accurately estimating Q values for individual actions.
  • FIG. 4 is a block diagram illustrating functional units of the server 300 according to Embodiment 1.
  • An experience acquisition unit 313 is implemented by execution of the experience acquisition program 303 by the processor 301. The experience acquisition unit 313 learns how the environment of the target space α is changed as a result of an air conditioning operation performed by the air conditioning apparatus 100 in response to a control instruction transmitted by the server 300. A state, an action, a reward, and a result are merged into an experience, which is transmitted to the memory database 308. The experience acquisition unit 313 receives a control instruction to be transmitted to the air conditioning apparatus 100 from an air conditioning control unit 314 so as to cause the state of the target space α to approach a target temperature distribution.
  • The air conditioning control unit 314 determines a control instruction to be transmitted to the air conditioning apparatus 100. The air conditioning control unit 314 is implemented by execution of the air conditioning control program 304 by the processor 301. The air conditioning control unit 314 receives, from the experience acquisition unit 313, a temperature distribution of the target space α acquired by the detecting unit 200 as a state, transmits the state to a neural network unit 315, and acquires Q values for individual actions that can be made.
  • The air conditioning control unit 314 may or may not use Q value information to determine an action. The air conditioning control unit 314 has a parameter called epsilon (ε) and determines, on the basis of the parameter, whether to use a Q value or search for a random action (ε-greedy method). The parameter ε is set to a fixed value by a developer in advance, or is decreased from 1 to 0 in proportion to a training time. The air conditioning control unit 314 randomly selects a number, compares the number with the ε value, and determines whether to use a Q value or to search for a random action. The air conditioning control unit 314 transmits the determined action to the experience acquisition unit 313. The method of determining an action is not limited thereto.
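  • The ε-greedy selection described above can be sketched as follows. The function names `epsilon_at` and `select_action`, the linear decay schedule, and the representation of Q values as a dictionary keyed by action ID are assumptions for this example.

```python
import random


def epsilon_at(step, total_steps):
    """Decrease epsilon linearly from 1 to 0 over the training time."""
    if total_steps <= 0:
        return 0.0
    return max(0.0, 1.0 - step / total_steps)


def select_action(q_values, epsilon, rng=random):
    """Pick an action ID from q_values (a dict of action ID -> Q value).

    A random number is compared with epsilon: below epsilon, a random
    action is explored; otherwise the action with the largest Q value is used.
    """
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)
```

  • Early in training, ε is close to 1 and most actions are random, which fills the memory database with varied experiences; as ε falls toward 0, the unit relies increasingly on the learned Q values.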
  • A learning unit 316 optimizes neural network parameters so that the neural network is capable of more accurately estimating Q values for individual actions when an input is a current state of the target space α. The learning unit 316 is implemented by execution of the learning program 306 by the processor 301.
  • (3) Process of Air Conditioning Control System 1
  • FIG. 5 is a flowchart illustrating a procedure of a process performed by the entire air conditioning control system 1 according to the present embodiment.
  • First, in step 401, the infrared sensor 201 acquires a thermal image of the target space α.
  • In step 402, the detecting unit 200 detects a temperature distribution of the target space α on the basis of the thermal image acquired by the infrared sensor 201. The detected temperature distribution is transmitted to the server 300 via the network N.
  • In step 403, the server 300 determines a control instruction to be transmitted to the air conditioning apparatus 100 on the basis of the received temperature distribution. Here, the server 300 determines a control instruction capable of achieving a target temperature distribution with simplest possible control by using the learning device. The server 300 transmits the determined control instruction to the air conditioning apparatus 100.
  • In step 404, the control unit 101 of the air conditioning apparatus 100 controls the individual devices of the air conditioning apparatus 100 so as to perform an air conditioning operation on the basis of the received control instruction.
  • In step 405, the detecting unit 200 detects a temperature distribution of the target space α in a state changed by the air conditioning operation performed by the air conditioning apparatus 100. The detected temperature distribution is transmitted to the server 300.
  • In step 406, the learning device of the server 300 is updated.
  • In step 407, the server 300 determines, on the basis of the temperature distribution acquired in step 405, whether the temperature distribution of the target space α has reached the target temperature distribution.
  • If it is determined in step 407 that the target temperature distribution is not reached (NO in 407), the process returns to step 403, and the individual steps are executed again until the target temperature distribution is reached. On the other hand, if it is determined in step 407 that the target temperature distribution is reached (YES in 407), the process ends. In this way, the process of the air conditioning control system 1 is completed.
  • Here, in the air conditioning control system 1, a start condition for starting the process and an end condition for ending the process may be defined. Specifically, the process is started when the temperature distribution or the state of the target space α acquired by the various sensors of the air conditioning apparatus 100 satisfies the start condition. When the temperature distribution or the state of the target space α acquired by the various sensors of the air conditioning apparatus 100 satisfies the end condition, the process ends even if the target temperature distribution is not reached. The start condition can be set such that, for example, a suction temperature detected by a temperature sensor attached to the air conditioning apparatus 100 is 19° C. or less and a pixel average of a thermal image acquired by the detecting unit 200 is 35 or less. The end condition can be set such that, for example, a suction temperature detected by the temperature sensor attached to the air conditioning apparatus 100 is 29° C. or more and a pixel average of a thermal image acquired by the infrared sensor 201 is 150 or more.
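  • The example start and end conditions can be written as two predicates. The thresholds (19° C., 35, 29° C., 150) are the example values given in the description; the function and parameter names are hypothetical.

```python
def start_condition(suction_temp_c, pixel_average):
    """Example start condition: cold suction air and a low thermal-image average."""
    return suction_temp_c <= 19 and pixel_average <= 35


def end_condition(suction_temp_c, pixel_average):
    """Example end condition: warm suction air and a high thermal-image average."""
    return suction_temp_c >= 29 and pixel_average >= 150
```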
  • (4) Process of Functional Units of Server 300
  • Next, a process performed by the functional units of the server 300 will be described. FIG. 6 is a flowchart illustrating a procedure of a process performed by the functional units of the server 300 according to the present embodiment. Each of the steps is executed by the processor 301.
  • First, in step 501, the experience acquisition unit 313 of the server 300 transmits, to the air conditioning control unit 314, a temperature distribution of the target space α received from the detecting unit 200 as a state of the target space α.
  • In step 502, the air conditioning control unit 314 receives, from the experience acquisition unit 313, the temperature distribution as the state of the target space α, and transfers the state to the neural network unit 315.
  • In step 503, the neural network unit 315 uses the state of the target space α as an input, and outputs Q values for individual actions by using the parameters in the neural network database 309.
  • In step 504, the neural network unit 315 transmits a list of the Q values for the individual actions to the air conditioning control unit 314.
  • In step 505, the air conditioning control unit 314 receives the Q values from the neural network unit 315 and determines an action having the largest Q value. The action determined by the air conditioning control unit 314 is transmitted to the experience acquisition unit 313.
  • In step 506, the experience acquisition unit 313 receives the action from the air conditioning control unit 314 and transmits the received action as a control instruction to the air conditioning apparatus 100.
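  • The selection made in step 505 amounts to taking the action with the largest Q value from the list produced in steps 503 and 504. A minimal sketch follows; the function name and the action labels in the usage note are hypothetical, and ties are resolved in favor of the first action encountered, which the description does not specify.

```python
from typing import Dict


def select_greedy_action(q_values: Dict[str, float]) -> str:
    """Return the action whose Q value is largest (step 505 analogue)."""
    if not q_values:
        raise ValueError("q_values must contain at least one action")
    # max() over the keys, ranked by their Q value; first key wins on ties.
    return max(q_values, key=q_values.get)
```

  • For instance, calling `select_greedy_action({"cool_low": 0.2, "cool_high": 0.9, "fan_only": -0.1})` selects `"cool_high"`, which would then be forwarded as the control instruction in step 506.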
  • The air conditioning apparatus 100 performs an air conditioning operation on the basis of the control instruction. After that, in step 507, the experience acquisition unit 313 determines a result and a reward and merges acquired pieces of information. Specifically, the experience acquisition unit 313 merges an original state, an action performed by the air conditioning apparatus 100, a reward, and a new state (result) into one experience. The experience acquisition unit 313 transmits the merged information to the memory database 308.
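  • The merging in step 507 can be pictured as building one record from the four pieces of information and appending it to a bounded store. The sketch below is an assumption about one possible data layout: `Experience`, `merge_experience`, `MemoryStore`, and the capacity limit are hypothetical names and choices, not details of the memory database 308.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Tuple

State = Tuple[float, ...]  # flattened temperature distribution


@dataclass(frozen=True)
class Experience:
    state: State       # original state
    action: str        # action performed by the apparatus
    reward: float      # reward determined for the result
    next_state: State  # new state (result)


def merge_experience(state: State, action: str, reward: float, next_state: State) -> Experience:
    """Merge the four pieces of information into one experience."""
    return Experience(state, action, reward, next_state)


class MemoryStore:
    """Bounded store; the oldest experience is dropped when capacity is exceeded."""

    def __init__(self, capacity: int = 10_000) -> None:  # capacity is an assumption
        self._items: Deque[Experience] = deque(maxlen=capacity)

    def store(self, experience: Experience) -> None:
        self._items.append(experience)

    def latest(self) -> Experience:
        return self._items[-1]

    def __len__(self) -> int:
        return len(self._items)
```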
  • In step 508, the learning unit 316 replaces the weights and biases of the neural network stored in the neural network database 309 with new weights and biases computed on the basis of the experience stored in the memory database 308. Accordingly, the neural network is optimized and updated. The timing at which the learning unit 316 performs updating is not limited thereto. Updating may be performed after all the steps have finished.
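  • The description does not give the update rule used in step 508. As an illustration only, the sketch below applies the standard one-step Q-learning target, r + γ·max Q(s′, a′), to a linear stand-in for the neural network. The class `LinearQ`, the discount factor, and the learning rate are assumptions; a deep network would replace the linear model in practice.

```python
from typing import List, Sequence


class LinearQ:
    """Q(s, a) = weights[a] . s + biases[a]; a linear stand-in for the network."""

    def __init__(self, n_features: int, n_actions: int) -> None:
        self.weights: List[List[float]] = [[0.0] * n_features for _ in range(n_actions)]
        self.biases: List[float] = [0.0] * n_actions

    def q_values(self, state: Sequence[float]) -> List[float]:
        return [
            sum(w * x for w, x in zip(ws, state)) + b
            for ws, b in zip(self.weights, self.biases)
        ]

    def update(self, state: Sequence[float], action: int, reward: float,
               next_state: Sequence[float], done: bool,
               gamma: float = 0.9, lr: float = 0.01) -> float:
        """One gradient step on the squared TD error; returns the TD error."""
        # Assumed target: reward plus discounted best Q value of the new state.
        target = reward if done else reward + gamma * max(self.q_values(next_state))
        td_error = target - self.q_values(state)[action]
        for i, x in enumerate(state):
            self.weights[action][i] += lr * td_error * x
        self.biases[action] += lr * td_error
        return td_error
```

  • Only the parameters of the action actually taken are changed by one update, which is why many stored experiences are typically needed before the Q values become useful.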
  • In step 509, the experience acquisition unit 313 determines whether the result reaches the target temperature distribution. If it is determined in step 509 that the result does not reach the target temperature distribution (NO in 509), the process returns to step 501, and the experience acquisition process is executed again until the target temperature distribution is reached. On the other hand, if it is determined in step 509 that the result reaches the target temperature distribution (YES in 509), the process performed by the functional units of the server 300 ends.
  • The present invention is not limited to the above-described embodiment and includes various modifications. For example, the above-described embodiment is a detailed description given for easy understanding of the present invention. The invention is not necessarily limited to an invention including all the configurations described above. Part of a configuration of an embodiment can be replaced with a configuration of another embodiment. Also, a configuration of an embodiment can be added to a configuration of another embodiment. For part of the configuration of each embodiment, another configuration can be added, deleted, or replaced.
  • Some or all of the above-described configurations, functions, processing units, and so forth may be implemented by hardware by designing them with an integrated circuit, for example. The above-described configurations, functions, and so forth may be implemented by software as a result of a processor interpreting and executing a program that implements the individual functions. Information such as the program implementing the individual functions, tables, and files can be stored in a recording device, such as a memory, a hard disk, or a solid state drive (SSD), or a recording medium, such as an IC card or an SD card. Control lines and information lines that are considered to be necessary for the description are illustrated, and all control lines and information lines of a product are not necessarily illustrated. Almost all configurations may be considered as actually being connected to each other.
  • (5) Features
  • (5-1)
  • The air conditioning control system 1 is a system for controlling the air conditioning apparatus 100 that performs an air conditioning operation in the target space α. The air conditioning control system 1 includes the detecting unit 200 and the server 300. The detecting unit 200 is built in the air conditioning apparatus 100 or is disposed in the target space α and includes the infrared sensor 201. The detecting unit 200 detects a temperature distribution of the target space α on the basis of a thermal image detected by the infrared sensor 201. The server 300 controls the air conditioning apparatus 100 such that the temperature distribution of the target space α approaches a target temperature distribution, on the basis of the temperature distribution detected by the detecting unit 200, by using the learning device that has learned by deep reinforcement learning.
  • The air conditioning control system 1 updates the learning device on the basis of a temperature distribution of the target space α obtained after the server 300 has controlled the air conditioning apparatus 100. The infrared sensor 201 detects a new thermal image of the target space α after the server 300 has controlled the air conditioning apparatus 100. The detecting unit 200 detects a new temperature distribution of the target space α on the basis of the new thermal image. The server 300 controls the air conditioning apparatus 100 such that the new temperature distribution approaches the target temperature distribution, by using the updated learning device.
  • Accordingly, the server 300 determines, by using the learning device that learns as appropriate by deep reinforcement learning, a control instruction for causing the temperature distribution of the target space α to approach the target temperature distribution, and thus the target temperature distribution can be efficiently reached in a short time.
  • In addition, the server 300 is capable of updating the learning device on the basis of a temperature distribution (result) of the target space α obtained after an air conditioning operation performed by the air conditioning apparatus 100 on the basis of a control instruction, and performing new air conditioning control by using the updated learning device. Accordingly, the target temperature distribution can be reached more efficiently in a short time.
  • (5-2)
  • The target temperature distribution set in advance in the air conditioning control system 1 is a uniformized temperature distribution or a temperature distribution including predetermined temperature variations. The learning device has learned by deep reinforcement learning on the basis of a value obtained through statistical processing performed on the temperature distribution of the target space α and the target temperature distribution.
  • The server 300 of the air conditioning control system 1 controls the air conditioning apparatus 100 such that the temperature distribution at a floor surface and a wall surface of the target space α approaches the target temperature distribution.
  • By transmitting, to the air conditioning apparatus 100, a control instruction for causing the temperature distribution at the floor surface and the wall surface of the target space α to approach the target temperature distribution, the server 300 can cause the temperature distribution of the target space α to efficiently approach the target temperature distribution obtained through statistical processing.
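  • The statistical processing in (5-2) is not specified further. One plausible reading, shown here purely as an assumption, is to compare the detected and target distributions with a root-mean-square deviation and to use its negative as the reward, while the population standard deviation measures how uniform a distribution is. All function names are hypothetical.

```python
from math import sqrt
from statistics import pstdev
from typing import Sequence


def rms_deviation(current: Sequence[float], target: Sequence[float]) -> float:
    """Root-mean-square difference between two flattened distributions."""
    if len(current) != len(target) or not current:
        raise ValueError("distributions must be non-empty and of equal length")
    return sqrt(sum((c - t) ** 2 for c, t in zip(current, target)) / len(current))


def reward(current: Sequence[float], target: Sequence[float]) -> float:
    """Assumed reward: larger (closer to zero) when the target is approached."""
    return -rms_deviation(current, target)


def spread(current: Sequence[float]) -> float:
    """Population standard deviation; zero for a fully uniform distribution."""
    return pstdev(current)
```

  • With a uniformized target, `spread` of the detected distribution tends toward zero as the target is approached; with a target that includes predetermined variations, `rms_deviation` against that target is the more direct measure.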
  • (5-3)
  • The server 300 of the air conditioning control system 1 controls at least any of an operation mode, a set temperature, an air direction, and an air volume in the air conditioning operation performed by the air conditioning apparatus 100 to cause the temperature distribution of the target space α to approach the target temperature distribution.
  • The learning device of the server 300 learns to achieve the target temperature distribution with the simplest possible control instruction. The server 300 transmits a control instruction including any of the above controls to the air conditioning apparatus 100, thereby causing the temperature distribution of the target space α to approach the target temperature distribution.
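  • The four controllable quantities named in (5-3) suggest a discrete action set formed from their combinations. The sketch below enumerates such a set; the `Action` type, the function name, and the concrete values used in the usage note are hypothetical, since the description does not list the available settings.

```python
from itertools import product
from typing import List, NamedTuple, Sequence


class Action(NamedTuple):
    operation_mode: str
    set_temp_c: int
    air_direction: str
    air_volume: str


def build_action_space(
    modes: Sequence[str],
    set_temps_c: Sequence[int],
    directions: Sequence[str],
    volumes: Sequence[str],
) -> List[Action]:
    """Every combination of the four controllable quantities."""
    return [Action(*combo) for combo in product(modes, set_temps_c, directions, volumes)]
```

  • As an example, two modes, three set temperatures, two directions, and two volumes give 2 × 3 × 2 × 2 = 24 actions, each of which would receive its own Q value from the neural network unit 315.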
  • (5-4)
  • An air conditioning control method is a method for controlling the air conditioning apparatus 100 that performs an air conditioning operation in the target space α. The air conditioning control method includes an acquisition step 401 of acquiring a thermal image of the target space α, a detection step 402 of detecting a temperature distribution of the target space α on the basis of the thermal image, and a control step 404 of controlling the air conditioning apparatus 100 such that the temperature distribution of the target space α approaches a target temperature distribution, on the basis of the temperature distribution, by using the learning device that has learned by deep reinforcement learning.
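  • The three steps of the method (acquisition, detection, control) can be chained as shown below. This is a sketch under stated assumptions: the linear mapping from 8-bit pixel values to degrees Celsius, its range, and the callables `acquire`, `policy`, and `send` are hypothetical, and the `policy` argument merely stands in for the learning device.

```python
from typing import Callable, List, Sequence

ThermalImage = Sequence[Sequence[int]]  # assumed 8-bit pixel values (0 to 255)
TemperatureGrid = List[List[float]]     # degrees Celsius


def detect_temperature_distribution(
    image: ThermalImage, t_min_c: float = 0.0, t_max_c: float = 51.0
) -> TemperatureGrid:
    """Detection step analogue; the linear calibration is an assumption."""
    scale = (t_max_c - t_min_c) / 255.0
    return [[t_min_c + pixel * scale for pixel in row] for row in image]


def control_once(
    acquire: Callable[[], ThermalImage],
    policy: Callable[[TemperatureGrid], str],
    send: Callable[[str], None],
) -> str:
    """Run acquisition, detection, and control once; return the chosen action."""
    image = acquire()                                      # acquisition step
    distribution = detect_temperature_distribution(image)  # detection step
    action = policy(distribution)                          # learning device stand-in
    send(action)                                           # control instruction
    return action
```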
  • (6)
  • The embodiment of the present disclosure has been described above. It is to be understood that the embodiment and the details can be variously changed without deviating from the gist and scope of the present disclosure described in the claims.
  • REFERENCE SIGNS LIST
      • 1 air conditioning control system
      • 100 air conditioning apparatus
      • 200 detecting unit
      • 201 infrared sensor
      • 300 server
      • 401 acquisition step
      • 402 detection step
      • 404 control step
      • α target space
      • α1 floor surface
      • α2 wall surface
    CITATION LIST Patent Literature
    • <PTL 1> Japanese Unexamined Patent Application Publication No. 2012-184899

Claims (20)

1. An air conditioning control system for controlling an air conditioning apparatus that performs an air conditioning operation in a target space, the air conditioning control system comprising:
a detecting unit configured to detect a temperature distribution of the target space; and
a server configured to control the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, based on the temperature distribution detected by the detecting unit by using a learning device that has learned by deep reinforcement learning.
2. The air conditioning control system according to claim 1, wherein
the detecting unit includes an infrared sensor arranged and configured to detect a thermal image of the target space, and
the detecting unit is configured to detect the temperature distribution of the target space based on the thermal image detected by the infrared sensor.
3. The air conditioning control system according to claim 1, wherein
the learning device is updated based on a temperature distribution of the target space obtained after the server has controlled the air conditioning apparatus,
the detecting unit includes an infrared sensor configured to detect a new thermal image of the target space after the server has controlled the air conditioning apparatus,
the detecting unit is configured to detect a new temperature distribution of the target space based on the new thermal image, and
the server is configured to control the air conditioning apparatus such that the new temperature distribution approaches the target temperature distribution, by using the updated learning device.
4. The air conditioning control system according to claim 1, wherein
the target temperature distribution is a uniformized temperature distribution or a temperature distribution including predetermined temperature variations, and
the learning device has learned by the deep reinforcement learning based on a value obtained through statistical processing performed on the temperature distribution of the target space and the target temperature distribution.
5. The air conditioning control system according to claim 1, wherein
the server is further configured to control the air conditioning apparatus such that the temperature distribution at a floor surface and a wall surface of the target space approaches the target temperature distribution.
6. The air conditioning control system according to claim 1, wherein
the detecting unit includes an infrared sensor that is built in the air conditioning apparatus or is disposed in the target space.
7. The air conditioning control system according to claim 1, wherein
the server is further configured to control at least any of an operation mode, a set temperature, an air direction, and an air volume in the air conditioning operation performed by the air conditioning apparatus to cause the temperature distribution of the target space to approach the target temperature distribution.
8. An air conditioning control method for controlling an air conditioning apparatus that performs an air conditioning operation in a target space, the air conditioning control method comprising:
acquiring a thermal image of the target space;
detecting a temperature distribution of the target space based on the thermal image; and
controlling the air conditioning apparatus such that the temperature distribution of the target space approaches a target temperature distribution, based on the temperature distribution by using a learning device that has learned by deep reinforcement learning.
9. The air conditioning control system according to claim 2, wherein
the learning device is updated based on a temperature distribution of the target space obtained after the server has controlled the air conditioning apparatus,
the infrared sensor is configured to detect a new thermal image of the target space after the server has controlled the air conditioning apparatus,
the detecting unit is configured to detect a new temperature distribution of the target space based on the new thermal image, and
the server is configured to control the air conditioning apparatus such that the new temperature distribution approaches the target temperature distribution, by using the updated learning device.
10. The air conditioning control system according to claim 2, wherein
the target temperature distribution is a uniformized temperature distribution or a temperature distribution including predetermined temperature variations, and
the learning device has learned by the deep reinforcement learning based on a value obtained through statistical processing performed on the temperature distribution of the target space and the target temperature distribution.
11. The air conditioning control system according to claim 2, wherein
the server is further configured to control the air conditioning apparatus such that the temperature distribution at a floor surface and a wall surface of the target space approaches the target temperature distribution.
12. The air conditioning control system according to claim 2, wherein
the infrared sensor is built in the air conditioning apparatus or is disposed in the target space.
13. The air conditioning control system according to claim 2, wherein
the server is further configured to control at least any of an operation mode, a set temperature, an air direction, and an air volume in the air conditioning operation performed by the air conditioning apparatus to cause the temperature distribution of the target space to approach the target temperature distribution.
14. The air conditioning control system according to claim 3, wherein
the target temperature distribution is a uniformized temperature distribution or a temperature distribution including predetermined temperature variations, and
the learning device has learned by the deep reinforcement learning based on a value obtained through statistical processing performed on the temperature distribution of the target space and the target temperature distribution.
15. The air conditioning control system according to claim 3, wherein
the server is further configured to control the air conditioning apparatus such that the temperature distribution at a floor surface and a wall surface of the target space approaches the target temperature distribution.
16. The air conditioning control system according to claim 3, wherein
the infrared sensor is built in the air conditioning apparatus or is disposed in the target space.
17. The air conditioning control system according to claim 3, wherein
the server is further configured to control at least any of an operation mode, a set temperature, an air direction, and an air volume in the air conditioning operation performed by the air conditioning apparatus to cause the temperature distribution of the target space to approach the target temperature distribution.
18. The air conditioning control system according to claim 4, wherein
the server is further configured to control the air conditioning apparatus such that the temperature distribution at a floor surface and a wall surface of the target space approaches the target temperature distribution.
19. The air conditioning control system according to claim 4, wherein
the detecting unit includes an infrared sensor that is built in the air conditioning apparatus or is disposed in the target space.
20. The air conditioning control system according to claim 4, wherein
the server is further configured to control at least any of an operation mode, a set temperature, an air direction, and an air volume in the air conditioning operation performed by the air conditioning apparatus to cause the temperature distribution of the target space to approach the target temperature distribution.

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2019045791A JP7071307B2 (en) 2019-03-13 2019-03-13 Air conditioning control system and air conditioning control method
JP2019-045791 2019-03-13
PCT/JP2020/009760 WO2020184454A1 (en) 2019-03-13 2020-03-06 Air-conditioning control system and air-conditioning control method

Publications (1)

Publication Number Publication Date
US20220178572A1 (en) 2022-06-09

Family

ID=72426082

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/437,393 Abandoned US20220178572A1 (en) 2019-03-13 2020-03-06 Air conditioning control system and air conditioning control method

Country Status (5)

Country Link
US (1) US20220178572A1 (en)
EP (1) EP3940306A4 (en)
JP (1) JP7071307B2 (en)
CN (1) CN113544439A (en)
WO (1) WO2020184454A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4220029A4 (en) * 2020-09-24 2024-04-03 Mitsubishi Electric Corp Air conditioner and air conditioning system
WO2022145417A1 (en) * 2020-12-29 2022-07-07 三菱電機株式会社 Learning device, air conditioner, communication terminal, air conditioning system, and method for learning control of air conditioner
EP4336784A1 (en) * 2021-05-07 2024-03-13 Daikin Industries, Ltd. Method and apparatus for controlling environment adjusting apparatus, and intelligent environment adjusting system
CN113203070B (en) * 2021-05-17 2022-12-20 佛山市爱居光电有限公司 LED infrared induction lamp with emergency function
CN114923265B (en) * 2022-07-20 2022-09-30 湖南工商大学 Central air conditioning energy-saving control system based on Internet of things

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150041550A1 (en) * 2013-08-12 2015-02-12 Azbil Corporation Air conditioning controlling device and method
US20160033162A1 (en) * 2014-08-04 2016-02-04 Mitsubishi Electric Corporation Indoor unit for air-conditioning apparatus
US20180100662A1 (en) * 2016-10-11 2018-04-12 Mitsubishi Electric Research Laboratories, Inc. Method for Data-Driven Learning-based Control of HVAC Systems using High-Dimensional Sensory Observations
JP2020106153A (en) * 2018-12-26 2020-07-09 株式会社日立製作所 Air-conditioning control system and method

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05312381A (en) * 1992-05-06 1993-11-22 Res Dev Corp Of Japan Air conditioning system
JP5312381B2 (en) 2010-03-17 2013-10-09 能美防災株式会社 Alarm and alarm system
JP2012141104A (en) * 2011-01-04 2012-07-26 Mitsubishi Electric Corp Air conditioner and method for controlling operaton of the same
JP5673226B2 (en) 2011-03-07 2015-02-18 富士通株式会社 Air conditioning improvement system
KR101947156B1 (en) * 2012-10-12 2019-02-13 엘지전자 주식회사 A display device of an air conditioner
JP6807556B2 (en) * 2015-10-01 2021-01-06 パナソニックIpマネジメント株式会社 Air conditioning control method, air conditioning control device and air conditioning control program
JP2018048750A (en) * 2016-09-20 2018-03-29 株式会社東芝 Air conditioning control device, air conditioning control method, and air conditioning control program
JP2018071853A (en) * 2016-10-27 2018-05-10 インフォグリーン株式会社 Learning device, control device, learning method, control method, learning program, and control program
JP6824785B2 (en) 2017-03-09 2021-02-03 アズビル株式会社 Air conditioning control system and method
JP7053180B2 (en) * 2017-07-11 2022-04-12 株式会社東芝 Information processing equipment, information processing methods, programs, and information processing systems
CN107730006B (en) * 2017-09-13 2021-01-05 重庆电子工程职业学院 Building near-zero energy consumption control method based on renewable energy big data deep learning

Also Published As

Publication number Publication date
EP3940306A4 (en) 2022-04-27
EP3940306A1 (en) 2022-01-19
CN113544439A (en) 2021-10-22
WO2020184454A1 (en) 2020-09-17
JP7071307B2 (en) 2022-05-18
JP2020148385A (en) 2020-09-17

Similar Documents

Publication Publication Date Title
US20220178572A1 (en) Air conditioning control system and air conditioning control method
JP6616791B2 (en) Information processing apparatus, information processing method, and computer program
US11268713B2 (en) Smart home air conditioner automatic control system based on artificial intelligence
JP7279445B2 (en) Prediction method, prediction program and information processing device
US11808473B2 (en) Action optimization device, method and program
US20160098631A1 (en) Apparatus and method for learning a model corresponding to time-series input data
CN110332671B (en) Control method, device and equipment of indoor unit and air conditioning system
JP7231403B2 (en) Air conditioning control system and method
WO2018193934A1 (en) Evaluation apparatus, evaluation method, and program therefor
JP6400834B2 (en) RECOMMENDATION DEVICE, RECOMMENDATION DETERMINING METHOD, AND COMPUTER PROGRAM
US20190278242A1 (en) Training server and method for generating a predictive model for controlling an appliance
CN112204581A (en) Learning device, deduction device, method and program
KR102038703B1 (en) Method for estimation on online multivariate time series using ensemble dynamic transfer models and system thereof
US20220154960A1 (en) Air-conditioning control device, air-conditioning system, air-conditioning control method, and non-transitory computer readable recording medium
JP2017215157A (en) Information processing device, program, and fall prediction system
CN109839889A (en) Equipment recommendation system and method
CN103077184A (en) Method for rule-based context acquisition
KR20220075123A (en) Analysis Service System for Battery Condition of Electric Bus
US20220137578A1 (en) Control system for equipment device
JP2020123139A (en) Information processing system, terminal device, client device, control method thereof, program, and storage medium
CN116795066A (en) Communication data processing method, system, server and medium of remote IO module
JP7145649B2 (en) Air-conditioning control device, air-conditioning control system, air-conditioning control method and program
US20220154961A1 (en) Control method, computer-readable recording medium storing control program, and air conditioning control device
CN111316313A (en) Collation support system, collation support method, and program
US10698370B2 (en) Device control method, device control apparatus and device control system

Legal Events

Date Code Title Description
AS Assignment

Owner name: DAIKIN INDUSTRIES, LTD., JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TANAKA, SHOUTA;MORINIBU, TAKESHI;NODA, TOMOHIRO;SIGNING DATES FROM 20200603 TO 20200604;REEL/FRAME:057417/0965

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION