CN113825356A - Energy-saving control method and device for cold source system, electronic equipment and storage medium

Info

Publication number: CN113825356A
Application number: CN202110856943.4A
Authority: CN
Inventors: 林依挺; 吴俊杰; 夏恒; 贾庆山; 王宇恒; 唐静娴; 陆翔
Original assignee: Tsinghua University; Tencent Technology Shenzhen Co Ltd
Current assignee: Tsinghua University; Tencent Technology Shenzhen Co Ltd
Priority date: 2021-07-28
Filing date: 2021-07-28
Publication date: 2021-12-21
Anticipated expiration: 2041-07-28
Also published as: CN113825356B

Abstract

The embodiments of the application disclose an energy-saving control method and device for a cold source system, an electronic device, and a storage medium. The method acquires the current state quantity of the cold source system and the target control strategy of a preset control model; predicts, according to the current state quantity and the target control strategy, predicted state quantities of multiple dimensions of the cold source system in a target time period; fuses the predicted state quantities of the multiple dimensions; performs benefit calculation on the fused predicted state quantity according to a preset reward function and preset constraint conditions to obtain a benefit value; uses the preset control model to determine, based on the benefit value, the total benefit value of the cold source system in a preset time period; when the total benefit value does not meet a preset condition, adjusts the target control strategy according to the total benefit value and continues prediction with the adjusted control strategy as the target control strategy; and, when the total benefit value meets the preset condition, outputs the trained control model for controlling the cold source system. The scheme can effectively realize energy-saving control of the cold source system.

Description

Energy-saving control method and device for cold source system, electronic equipment and storage medium

Technical Field

The application relates to the technical field of computers, in particular to an energy-saving control method and device for a cold source system, electronic equipment and a storage medium.

Background

In the field of communication and information technology, a data center is a machine room that houses servers. It is used to transmit, accelerate, display, compute, and store data over the internet infrastructure. With the continuous development of information technology and people's growing material and cultural needs, more and more enterprises recognize that data processing, storage, and exchange strongly affect enterprise value; data has become one of an enterprise's most important assets, and data centers are in a period of rapid development. A data center usually consumes a large amount of electric energy and generates a large amount of heat. In the prior art, however, the data center is cooled only by an air conditioning system, which cannot meet the requirement for real-time and effective energy-saving control of the data center and is not sufficiently energy-saving or environmentally friendly.

Disclosure of Invention

The embodiment of the application provides an energy-saving control method and device for a cold source system, an electronic device and a storage medium, which can effectively realize energy-saving control of the cold source system and greatly reduce energy consumption of the cold source system.

The embodiment of the application provides an energy-saving control method of a cold source system, which comprises the following steps:

acquiring a current state quantity of a cold source system and a target control strategy of a preset control model, wherein the current state quantity comprises the state quantities of a current time period and of a preset time period before the current time period;

predicting, according to the current state quantity and the target control strategy, predicted state quantities of multiple dimensions of the cold source system in a target time period, and fusing the predicted state quantities of the multiple dimensions to obtain a fused predicted state quantity;

performing benefit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a benefit value of the cold source system in the target time period;

determining, by using the preset control model and based on the benefit value, a total benefit value of the cold source system in a preset time period, wherein the preset time period comprises at least one target time period;

when the total benefit value does not meet a preset condition, adjusting the target control strategy according to the total benefit value to obtain an adjusted control strategy, and continuing to predict the multi-dimensional predicted state quantities of the cold source system in the target time period with the adjusted control strategy as the target control strategy;

and when the total benefit value meets a preset condition, outputting a trained control model for controlling the cold source system.
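The steps above can be read as an iterative training loop. The following is a minimal sketch of that loop, not the method as implemented in the application: the state and the strategy are reduced to single numbers, the callables `predict`, `benefit`, and `adjust` are hypothetical stand-ins for the prediction, benefit calculation, and strategy adjustment steps, and the "preset condition" is assumed to be a simple threshold on the total benefit value.

```python
from typing import Callable, Tuple


def train_control_strategy(
    state: float,
    strategy: float,
    predict: Callable[[float, float], float],
    benefit: Callable[[float], float],
    adjust: Callable[[float, float], float],
    n_periods: int,
    threshold: float,
    max_rounds: int = 100,
) -> Tuple[float, float]:
    """Repeat predict -> benefit -> adjust until the total benefit is acceptable.

    All callables and the threshold test are hypothetical stand-ins.
    """
    total = float("-inf")
    for _ in range(max_rounds):
        current, total = state, 0.0
        for _ in range(n_periods):
            # Fused predicted state quantity for the next (target) time period.
            current = predict(current, strategy)
            # Benefit value of that target time period.
            total += benefit(current)
        if total >= threshold:
            # Preset condition met (assumed here to be a simple threshold).
            break
        # Otherwise adjust the target control strategy and predict again.
        strategy = adjust(strategy, total)
    return strategy, total
```

The `max_rounds` cap is an added safeguard so that the sketch always terminates; the source does not state one.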

Correspondingly, the embodiment of the present application further provides an energy-saving control device for a cold source system, including:

an acquisition unit, configured to acquire a current state quantity of a cold source system and a target control strategy of a preset control model, wherein the current state quantity comprises the state quantities of a current time period and of a preset time period before the current time period;

a prediction unit, configured to predict, according to the current state quantity and the target control strategy, predicted state quantities of multiple dimensions of the cold source system in a target time period, and to fuse the predicted state quantities of the multiple dimensions to obtain a fused predicted state quantity;

a calculating unit, configured to perform benefit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a benefit value of the cold source system in the target time period;

a determining unit, configured to determine, by using the preset control model and based on the benefit value, a total benefit value of the cold source system in a preset time period, wherein the preset time period comprises at least one target time period;

an adjusting unit, configured to, when the total benefit value does not meet a preset condition, adjust the target control strategy according to the total benefit value to obtain an adjusted control strategy, and continue predicting the multi-dimensional predicted state quantities of the cold source system in the target time period with the adjusted control strategy as the target control strategy;

and a control unit, configured to output a trained control model for controlling the cold source system when the total benefit value meets the preset condition.

Optionally, in some embodiments, the prediction unit includes a determination subunit, a first prediction subunit, a second prediction subunit, and a fusion subunit, as follows:

the determining subunit is configured to determine a target control quantity currently adopted according to the current state quantity and a target control strategy;

the first prediction subunit is configured to predict a first predicted state quantity of the cold source system in the target time period by using a data-driven submodel, where the data-driven submodel is a data model obtained from historical data of the cold source system, and the historical data includes historical state quantities and historical control quantities of the cold source system;

the second prediction subunit is used for predicting a second prediction state quantity of the cold source system in a target time period by using a mechanism energy consumption submodel, wherein the mechanism energy consumption submodel is a physical model established based on reference energy consumption of internal equipment of the cold source system;

and the fusion subunit is used for fusing the first prediction state quantity and the second prediction state quantity to obtain a fused prediction state quantity.

Optionally, in some embodiments, the mechanism energy consumption sub-model includes a chiller energy consumption module, a chilled water pump energy consumption module, and a cooling water pump energy consumption module, the second prediction state quantity includes a second prediction chiller power, a second prediction chilled water pump power, and a second prediction cooling water pump power, and the second prediction sub-unit includes a first module, a second module, and a third module, as follows:

the first module is used for predicting the second predicted water chilling unit power of the cold source system in a target time period by using the water chilling unit energy consumption module;

the second module is used for predicting the second predicted chilled water pump power of the cold source system in the target time period by using the chilled water pump energy consumption module;

and the third module is used for predicting the second predicted cooling water pump power of the cold source system in the target time period by using the cooling water pump energy consumption module.

Optionally, in some embodiments, the current state quantity includes a target cooling water outlet temperature and a target chilled water return temperature, the target control quantity includes a target chilled water outlet temperature and a target chilled water flow rate, and the first module may be specifically configured to: obtain chiller model parameters of the chiller energy consumption module; and calculate the second predicted chiller power of the cold source system in the target time period based on the target cooling water outlet temperature, the target chilled water return temperature, the target chilled water outlet temperature, the target chilled water flow rate, and the chiller model parameters.

Optionally, in some embodiments, the target control quantity includes a target chilled water pump flow rate, and the second module may be specifically configured to: obtain chilled water model parameters of the chilled water pump energy consumption module; and calculate the second predicted chilled water pump power of the cold source system in the target time period based on the target chilled water pump flow rate and the chilled water model parameters.

Optionally, in some embodiments, the target control quantity includes a target cooling water pump flow rate, and the third module may be specifically configured to: obtain cooling water model parameters of the cooling water pump energy consumption module; and calculate the second predicted cooling water pump power of the cold source system in the target time period based on the target cooling water pump flow rate and the cooling water model parameters.
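As an illustration of a pump energy consumption module that maps a target flow rate and a set of model parameters to a predicted power, the sketch below evaluates a polynomial in the flow rate. The polynomial form, the function name `pump_power`, and the parameter layout are assumptions made for illustration only; this passage does not state the functional form of the mechanism energy consumption submodel.

```python
from typing import Sequence


def pump_power(flow: float, params: Sequence[float]) -> float:
    """Evaluate params[0] + params[1]*flow + params[2]*flow**2 + ... (Horner form).

    The polynomial form is an assumed stand-in for the pump model.
    """
    power = 0.0
    for coefficient in reversed(params):
        power = power * flow + coefficient
    return power
```

For example, `pump_power(2.0, (1.0, 0.0, 0.0, 0.5))` evaluates 1.0 + 0.5 × 2.0³. The same function could serve both the chilled water pump and the cooling water pump with different parameter sets.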

Optionally, in some embodiments, the fusion subunit may be specifically configured to: determine a first weight for the first predicted state quantity and a second weight for the second predicted state quantity, each based on the prediction errors of the data-driven submodel and the mechanism energy consumption submodel; and fuse the first predicted state quantity and the second predicted state quantity based on the first weight and the second weight to obtain the fused predicted state quantity.
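One way to derive the two weights from the prediction errors of the two submodels is inverse-error weighting, where the submodel with the smaller error receives the larger weight. The sketch below is an assumption for illustration; the passage above states only that the weights depend on the prediction errors of both submodels, not how. The names `fusion_weights` and `fuse` are hypothetical.

```python
from typing import List, Sequence, Tuple


def fusion_weights(err_data: float, err_mech: float) -> Tuple[float, float]:
    """Return (first_weight, second_weight); a smaller error gives a larger weight."""
    total = err_data + err_mech
    if total == 0.0:
        # Both submodels are error-free on the reference data: weight them equally.
        return 0.5, 0.5
    return err_mech / total, err_data / total


def fuse(first: Sequence[float], second: Sequence[float],
         w1: float, w2: float) -> List[float]:
    """Element-wise weighted combination of the two predicted state quantities."""
    return [w1 * a + w2 * b for a, b in zip(first, second)]
```

With errors of 1.0 for the data-driven submodel and 3.0 for the mechanism submodel, the weights are 0.75 and 0.25, so the fused prediction leans toward the data-driven result.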

Optionally, in some embodiments, the calculation unit includes a first calculation subunit, a second calculation subunit, and a third calculation subunit, as follows:

the first calculating subunit is configured to perform reward calculation on the fused predicted state quantity according to a preset reward function to obtain a reward value of the cold source system in a target time period;

the second calculating subunit is configured to perform penalty calculation on the fused predicted state quantity based on a preset constraint condition, so as to obtain a penalty value of the cold source system in a target time period;

and the third calculation subunit is used for calculating a benefit value of the cold source system in a target time period based on the reward value and the penalty value.

Optionally, in some embodiments, the fused predicted state quantity includes a fused predicted chiller power, a fused predicted chilled water pump power, and a fused predicted cooling water pump power, and the first calculating subunit may be specifically configured to: determine a reward weight of the fused predicted state quantity; and calculate the reward value of the cold source system in the target time period based on the fused predicted chiller power, the fused predicted chilled water pump power, the fused predicted cooling water pump power, and the reward weight.

Optionally, in some embodiments, the second calculating subunit may be specifically configured to: obtain the equipment load of the data center, the cooling capacity coefficient of the cold source system, and the cooling capacity of the cold source system in the target time period; and calculate the penalty value of the cold source system in the target time period based on the equipment load of the data center, the cooling capacity coefficient, and the cooling capacity.
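The reward, penalty, and benefit calculations described above can be sketched as follows. The specific forms are assumptions for illustration: the reward is taken as the negative weighted sum of the three fused predicted powers (lower energy use gives a higher reward), the penalty is taken as proportional to any shortfall of the cooling capacity relative to the coefficient-scaled equipment load, and the benefit is taken as the reward minus the penalty. The function names and the `scale` parameter are hypothetical.

```python
from typing import Sequence


def reward_value(p_chiller: float, p_chilled_pump: float, p_cooling_pump: float,
                 weights: Sequence[float] = (1.0, 1.0, 1.0)) -> float:
    """Assumed reward: negative weighted sum of the fused predicted powers."""
    return -(weights[0] * p_chiller
             + weights[1] * p_chilled_pump
             + weights[2] * p_cooling_pump)


def penalty_value(equipment_load: float, capacity_coeff: float,
                  cooling_capacity: float, scale: float = 1.0) -> float:
    """Assumed penalty: proportional to the cooling shortfall, zero if none."""
    shortfall = capacity_coeff * equipment_load - cooling_capacity
    return scale * max(0.0, shortfall)


def benefit_value(reward: float, penalty: float) -> float:
    """Assumed combination: benefit is the reward reduced by the penalty."""
    return reward - penalty
```

Under these assumptions, a strategy that saves pump or chiller power but leaves the machine room under-cooled is scored lower than one that meets the cooling demand.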

Optionally, in some embodiments, the target time period is the time period next to the current time period, and the determining unit may be specifically configured to: update the target control strategy of the preset control model based on the benefit value to obtain an updated control strategy; take the target time period as the current time period and the updated control strategy as the target control strategy of the preset control model, and continue predicting the next time period of the cold source system until the benefit value of each time period in the preset time period is obtained; and determine the total benefit value of the cold source system in the preset time period based on the benefit value of each time period in the preset time period.
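Once the benefit value of each time period in the preset time period is available, the total benefit value can be accumulated from them. The sketch below uses a discounted sum; the discount factor `gamma` is an assumption borrowed from common reinforcement learning practice and is not stated in this passage. With `gamma=1.0` the result is the plain sum of the per-period benefit values.

```python
from typing import Sequence


def total_benefit(period_benefits: Sequence[float], gamma: float = 1.0) -> float:
    """Sum the per-period benefit values, weighting period t by gamma**t."""
    return sum((gamma ** t) * value for t, value in enumerate(period_benefits))
```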

Optionally, in some embodiments, the control unit may be specifically configured to: obtain a current state quantity of the cold source system, where the current state quantity includes the state quantities of the current time period and of a preset time period before the current time period; determine a system control strategy of the cold source system by using the trained control model based on the current state quantity; and control the cold source system according to the system control strategy.

In addition, a computer-readable storage medium is provided, where the computer-readable storage medium stores a plurality of instructions, and the instructions are suitable for being loaded by a processor to perform the steps of any of the energy-saving control methods for a cold source system provided in the embodiments of the present application.

In addition, an embodiment of the present application further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor executes the computer program to implement the steps of any of the energy-saving control methods for a cold source system provided in the embodiments of the present application.

According to an aspect of the present application, there is provided a computer program product or a computer program, the computer program product or the computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the method provided in the various alternative implementations of the energy-saving control of the cold source system described above.

The present embodiment can acquire a current state quantity of a cold source system and a target control strategy of a preset control model, where the current state quantity includes the state quantities of the current time period and of a preset time period before the current time period; then predict, according to the current state quantity and the target control strategy, predicted state quantities of multiple dimensions of the cold source system in a target time period, and fuse the predicted state quantities of the multiple dimensions to obtain a fused predicted state quantity; then perform benefit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a benefit value of the cold source system in the target time period; use the preset control model to determine, based on the benefit value, a total benefit value of the cold source system in a preset time period, where the preset time period includes at least one target time period; when the total benefit value does not meet a preset condition, adjust the target control strategy according to the total benefit value to obtain an adjusted control strategy, and continue predicting the multi-dimensional predicted state quantities of the cold source system in the target time period with the adjusted control strategy as the target control strategy; and when the total benefit value meets the preset condition, output a trained control model for controlling the cold source system. The scheme can effectively realize energy-saving control of the cold source system and greatly reduce its energy consumption.

Drawings

To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1a is a schematic view of a scenario of an energy-saving control method for a cold source system according to an embodiment of the present application;

FIG. 1b is a first flowchart of an energy-saving control method for a cold source system according to an embodiment of the present application;

FIG. 1c is a schematic structural diagram of a cold source system according to an embodiment of the present application;

FIG. 1d is a second flowchart of an energy-saving control method for a cold source system according to an embodiment of the present application;

FIG. 1e is a schematic diagram of a simulation model of a cold source system according to an embodiment of the present application;

FIG. 2a is a third flowchart of an energy-saving control method for a cold source system according to an embodiment of the present application;

FIG. 2b is a fourth flowchart of an energy-saving control method for a cold source system according to an embodiment of the present application;

FIG. 3 is a schematic structural diagram of an energy-saving control device for a cold source system according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The principles of the present application are illustrated as being implemented in a suitable computing environment. In the description that follows, specific embodiments of the present application are described with reference to steps and symbols of operations executed by one or more computers, unless otherwise indicated. These steps and operations, which are at times referred to as being computer-executed, include the manipulation by the computer's processing unit of electronic signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well known to those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular properties defined by the data format. However, while the principles of the application are described in the foregoing terms, this is not meant to be limiting, and those of ordinary skill in the art will recognize that various steps and operations described below may also be implemented in hardware.

The term "unit" as used herein may be considered a software object executing on the computing system. The various components, units, engines, and services described herein may be viewed as objects of implementation on the computing system. The apparatus and method described herein may be implemented in software, or may be implemented in hardware, and are within the scope of the present application.

The terms "first", "second", and "third", etc. in this application are used to distinguish between different objects and not to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but rather, some embodiments may include other steps or elements not listed or inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

The embodiment of the application provides an energy-saving control method and device for a cold source system, electronic equipment and a storage medium. The energy-saving control device of the cold source system can be integrated in electronic equipment, and the electronic equipment can be a server or a terminal and the like.

The energy-saving control method of the cold source system provided by the embodiment of the application relates to the machine learning technology in the field of artificial intelligence, and can be used for training a control model and a strategy by utilizing the machine learning of the artificial intelligence, so that the corresponding control quantity is output according to the working condition of the cold source system, the energy-saving control of the cold source system of the data center is realized, and the efficient control strategy optimization is realized.

Artificial Intelligence (AI) is the theory, methods, techniques, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best result. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that the machines can perceive, reason, and make decisions. Artificial intelligence is a broad discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Artificial intelligence software technology mainly includes computer vision, machine learning/deep learning, and other directions.

Machine Learning (ML) is a multi-disciplinary subject that draws on probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other fields. It studies how a computer can simulate or realize human learning behavior so as to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to give computers intelligence, and it is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

For example, as shown in FIG. 1a, the electronic device integrated with the energy-saving control device of the cold source system may first acquire a current state quantity of the cold source system and a target control strategy of a preset control model, where the current state quantity includes the state quantities of the current time period and of a preset time period before the current time period; then predict, according to the current state quantity and the target control strategy, predicted state quantities of multiple dimensions of the cold source system in a target time period, and fuse the predicted state quantities of the multiple dimensions to obtain a fused predicted state quantity; then perform benefit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a benefit value of the cold source system in the target time period; use the preset control model to determine, based on the benefit value, a total benefit value of the cold source system in a preset time period, where the preset time period includes at least one target time period; when the total benefit value does not meet a preset condition, adjust the target control strategy according to the total benefit value to obtain an adjusted control strategy, and continue predicting the multi-dimensional predicted state quantities of the cold source system in the target time period with the adjusted control strategy as the target control strategy; and when the total benefit value meets the preset condition, output a trained control model for controlling the cold source system.

The scheme can use the sensor monitoring data accumulated by the data center to learn and construct a simulation model of the cold source system. Based on the simulation model, and taking into account the operating constraints of the system, the energy consumption optimization target, and the like, a reward function is designed and a reinforcement learning algorithm is used to optimize the control strategy, finally realizing energy saving of the cold source system. The scheme can model and optimize the data center cold source system and, according to fluctuations of the equipment load in the data center machine room, obtain more energy-saving control parameters for the cold source system under the constraint of adequate cooling capacity, thereby reducing energy consumption. Combining the data-driven model with the mechanism-driven model improves both accuracy and interpretability. Based on the trained control model and strategy, the corresponding control quantity can be output directly according to the working condition of the cold source system, realizing efficient control strategy optimization. As the system operates, its working condition characteristics may change, and the model and strategy can be updated iteratively and continuously, so that real-time dynamic adjustment is realized and the system has good adaptability.

Detailed descriptions are given below. It should be noted that the order of the following embodiments is not intended to limit the preferred order of the embodiments.

This embodiment is described from the perspective of the energy-saving control device of the cold source system. The energy-saving control device of the cold source system may be specifically integrated in an electronic device, where the electronic device may be a server or a terminal; the terminal may include a mobile phone, a tablet computer, a notebook computer, a Personal Computer (PC), and other devices.

An energy-saving control method for a cold source system comprises the following steps: acquiring a current state quantity of a cold source system and a target control strategy of a preset control model, wherein the current state quantity comprises the state quantities of a current time period and of a preset time period before the current time period; then predicting, according to the current state quantity and the target control strategy, predicted state quantities of multiple dimensions of the cold source system in a target time period, and fusing the predicted state quantities of the multiple dimensions to obtain a fused predicted state quantity; then performing benefit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a benefit value of the cold source system in the target time period; determining, by using the preset control model and based on the benefit value, a total benefit value of the cold source system in a preset time period, wherein the preset time period comprises at least one target time period; when the total benefit value does not meet a preset condition, adjusting the target control strategy according to the total benefit value to obtain an adjusted control strategy, and continuing to predict the multi-dimensional predicted state quantities of the cold source system in the target time period with the adjusted control strategy as the target control strategy; and when the total benefit value meets the preset condition, outputting a trained control model for controlling the cold source system.

As shown in fig. 1b, a specific process of the energy saving control method of the cooling source system may be as follows:

101. the method comprises the steps of obtaining the current state quantity of a cold source system and a target control strategy of a preset control model, wherein the current state quantity comprises the current time interval and the state quantity of the preset time interval before the current time interval.

The cold source system may refer to a system capable of dissipating and cooling heat of devices such as servers in a data center room or other devices. For example, for a data center, a cold source system is used as a cold source of a terminal machine room of the data center, and the overall energy efficiency of the data center is greatly affected, so that it is necessary to explore a model of the cold source system and optimize the control of the cold source system by combining a related optimization method, thereby realizing energy saving of the data center.

For example, the data center cold source system may be composed of chillers (water chilling units), chilled water pumps, cooling towers and other devices. Fig. 1c shows a basic structure of a cold source system, which includes a chiller, a chilled water circulation pump, a cooling tower, and so on. Three cooling cycles are formed among the devices: a chilled water cycle, a refrigerant cycle and a cooling water cycle, which are coupled with and influence one another. To save energy in the cold source system, control quantities such as the temperature and flow of the chilled water and the cooling water can be optimized to reduce the energy consumption of the chiller, the chilled water pump and the cooling water pump. The cold source system may include a plurality of chillers, a plurality of chilled water pumps and a plurality of cooling water pumps. The chillers need not correspond one-to-one with the chilled water pumps and cooling water pumps, that is, one chiller does not necessarily correspond to exactly one chilled water pump and one cooling water pump; the configuration can be set according to actual demand.

To improve the accuracy of cold source system control, the system states and the control quantities adopted at the current time period and several preceding time periods can be used to study the cold source system, so as to obtain a better optimized, more efficient and more energy-saving control strategy. Therefore, the current state quantity may include the state quantities of the current time period and of a preset time period before the current time period.

A large number of sensors are generally deployed in a data center cold source system. The data are partly redundant, and much of the monitoring data is irrelevant to energy-saving optimization of the cold source system, so a number of key features can be selected. For example, the data can be divided into three categories according to the unit they belong to: chiller state quantities, water pump state quantities, and environmental and external variables, which are subdivided into 11 items. The state quantities of the cold source system considered may be as shown in Table 1 below:

Table 1: State quantities considered in the study

The main control quantities in the data center cold source system can be roughly divided into temperature and flow. For example, the temperature aspect may include the chilled water outlet temperature of the chiller and the cooling water return temperature; the flow aspect may include the chilled water flow and the cooling water flow. The control quantities considered for the cold source system may be as shown in Table 2 below:

Table 2: Control quantities considered in the study

The control strategy may refer to the strategy and method for controlling the cold source system. For example, a control model, i.e. the preset control model, may be set in advance, and its control strategy may then be optimized with a reinforcement learning algorithm to find an optimal control strategy for controlling the cold source system, thereby optimizing the cold source system.

For example, the current control amount to be adopted may be specifically determined according to the current state amount of the cold source system and the target control strategy.

The preset time period may be set in various manners, for example, the preset time period may be flexibly set according to the requirements of the actual application, or may be preset and stored in the electronic device. In addition, the preset time period may be built in the electronic device, or may be saved in a memory and transmitted to the electronic device, or the like. The preset time period may refer to a preset number of time periods, for example, the preset time period may be 5 time periods, that is, the preset time period before the current time period may be 5 time periods before the current time period, and so on.

102. And predicting the predicted state quantities of multiple dimensions of the cold source system in a target time period according to the current state quantity and a target control strategy, and fusing the predicted state quantities of the multiple dimensions to obtain the fused predicted state quantity.

The data center can be modeled with a data-driven method, that is, by learning neural network parameters from historical data. For example, a deep neural network can be trained directly on historical data to build a purely data-driven model; a linear model can also be considered, but it has no physical mechanism behind it and lacks interpretability. Generally speaking, a data-driven model captures the numerical relationship between inputs and outputs and predicts small-range fluctuations accurately, but its interpretability is poor; a mechanism model better describes the relationship between energy consumption and the related control variables, is highly interpretable, and predicts large trends accurately. Therefore, to improve the accuracy of cold source system prediction, a combined data-driven and mechanism approach is adopted, that is, a purely data-based model is combined with a physically grounded mechanism model. The data-driven method fits historical data with an artificial neural network and captures the correlations between features well, but it is prone to overfitting and generalizes poorly. The mechanism-driven method relies on the fact that equipment such as the chiller and the chilled/cooling water pumps have reference physical energy consumption models whose parameters can be fitted with historical data; it reflects the relationship between energy consumption and key features well and generalizes strongly, but responds slowly to changes in secondary features. Fusing the two combines their advantages and makes the model more accurate.

To improve the efficiency of energy-saving control of the cold source system, a simulation model may be constructed first. For example, as shown in fig. 1d, the historical data may be preprocessed and key features selected. A cold source system model is then built from the processed historical data to obtain a simulation model that can simulate the running state of the real cold source system (for example, given the current system state and control quantities as input, the simulation model is expected to return the corresponding new state), predict the predicted state quantities of the cold source system in multiple dimensions in the target period, and fuse them. For example, the simulation model may be a fusion of data-driven and mechanism energy consumption models, that is, it may comprise the data-driven sub-model, the mechanism energy consumption sub-model and the fusion sub-model (which fuses the outputs of the other two) of the preset control model. For example, as shown in fig. 1e, the state quantities and control quantities of the system at the current and several previous steps (i.e., the current time period and a preset time period before it) are input into the model, which is expected to output the system state quantity of the next time period. Based on the constructed simulation model, and considering the operating constraints and the energy consumption optimization target of the system, a reward function is designed, the control strategy of the cold source system is optimized with a reinforcement learning method, and the model and strategy are verified and tested accordingly, so as to improve the accuracy of the preset control model and further optimize the energy saving of the data center cold source system.

For example, the preset control model may include a data driving sub-model and a mechanism energy consumption sub-model, and specifically, the target control quantity currently adopted may be determined according to the current state quantity and the target control strategy; predicting a first prediction state quantity of the cold source system in a target period by using a data driving submodel, wherein the data driving submodel is a data model obtained based on historical data training of the cold source system, and the historical data comprises the historical state quantity and the historical control quantity of the cold source system; predicting a second predicted state quantity of the cold source system in a target period by utilizing a mechanism energy consumption submodel, wherein the mechanism energy consumption submodel is a physical model established based on reference energy consumption of internal equipment of the cold source system; and fusing the first prediction state quantity and the second prediction state quantity to obtain a fused prediction state quantity. The first predicted state quantity may represent a relationship between a current state quantity of the cold source system and a state quantity of the target period, and the second predicted state quantity may represent a relationship between energy consumption of the cold source system and the current state quantity.

For example, the preset control model may include a fusion sub-model, and specifically, the first predicted state quantity and the second predicted state quantity may be fused by using the fusion sub-model to obtain a fused predicted state quantity.

The data-driven sub-model may use a machine learning method to fit a state transition function, such as f(·) in the formula below, from historical data; for example, an artificial neural network (a recurrent neural network or one of its variants), a regression tree (e.g. XGBoost) or another method may be used.

For example, let the system state in period t be S_t (the states considered are shown in Table 1) and the control quantity adopted be a_t (the control quantities considered are shown in Table 2). Considering the multi-period influence of the control quantities, the system states and control quantities of the current and preceding periods are taken as input, and the system state of the next period is output, namely:

S_{t+1} = f(S_{t-L+1}, a_{t-L+1}, ..., S_t, a_t)

where L is the window length considered (i.e., the preset time period), that is, the number of periods that may affect the next state quantity; its value may be determined according to the specific characteristics of the system. The data-driven sub-model is used to predict the first predicted state quantity of the cold source system in the target period, which comprises a first predicted chiller power, a first predicted chilled water pump power and a first predicted cooling water pump power; that is, for the next period, the first predicted chiller power is P′_ch, the first predicted chilled water pump power is P′_chp, and the first predicted cooling water pump power is P′_cp.
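The windowed input of the state transition function can be illustrated with a minimal sketch. The helper name `make_windows` and the toy history below are assumptions for illustration only; the sketch merely arranges each sample as the L most recent (state, control) pairs with the next-period state as the target, and leaves the choice of the learner f(·) open.

```python
def make_windows(states, actions, L):
    """Build (input, target) pairs for S_{t+1} = f(S_{t-L+1}, a_{t-L+1}, ..., S_t, a_t).

    states[t] and actions[t] are lists of numbers for period t.
    """
    inputs, targets = [], []
    for t in range(L - 1, len(states) - 1):
        window = []
        # Concatenate state and control quantities of the L periods ending at t.
        for k in range(t - L + 1, t + 1):
            window.extend(states[k])
            window.extend(actions[k])
        inputs.append(window)
        targets.append(list(states[t + 1]))
    return inputs, targets


# Toy history (made-up numbers): 8 periods, 2 state features, 1 control quantity.
states = [[float(2 * i), float(2 * i + 1)] for i in range(8)]
actions = [[float(i)] for i in range(8)]
X, y = make_windows(states, actions, L=3)
```

With L = 3, each input row holds three consecutive (state, control) pairs, and the resulting pairs could be passed to any regressor standing in for f(·).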

The devices of the cold source system have reference physical energy consumption models, and the mechanism energy consumption sub-model may fit the parameters of these models using historical data. For example, the chiller, the chilled water pump and the cooling water pump have relatively clear and simple physical mechanism models, which can be used to help improve prediction accuracy.

For example, the mechanism energy consumption submodel includes a chiller energy consumption module, a chilled water pump energy consumption module and a cooling water pump energy consumption module, the second prediction state quantity includes a second prediction chiller power, a second prediction chilled water pump power and a second prediction cooling water pump power, and specifically, the chiller energy consumption module may be used to predict the second prediction chiller power of the cold source system in the target time period; predicting a second predicted chilled water pump power of the cold source system in a target time period by using a chilled water pump energy consumption module; and predicting second predicted cooling water pump power of the cold source system in a target time period by using a cooling water pump energy consumption module.

The energy consumption of the water chilling unit is related to factors such as condensing temperature, evaporating temperature and chilled water flow. The condensing temperature in the water chilling unit can be expressed by the outlet temperature of cooling water, and the evaporating temperature can be expressed by the outlet temperature of chilled water. Therefore, a chiller energy consumption model (i.e., a chiller energy consumption module) about the cooling water outlet temperature, the chilled water outlet temperature and the chiller load can be provided. For example, the current state quantity includes a target cooling water outlet water temperature and a target chilled water return water temperature, the target control quantity includes a target chilled water outlet water temperature and a target chilled water flow, and a cold water model parameter of the water chilling unit energy consumption module can be specifically obtained; and calculating the second predicted water chilling unit power of the cold source system in a target time period based on the target cooling water outlet water temperature, the target chilled water return water temperature, the target chilled water outlet water temperature, the target chilled water flow and the cold water model parameters.

For example, the specific expression of the energy consumption module of the water chilling unit may be as follows:

where Q_ch = (T_chwr - T_chws) * m_chw, and α_i (i = 0, 1, ..., 5) are the parameters to be determined of the chiller energy consumption model. P″_ch is the second predicted chiller power, i.e. the second predicted chiller input power; T_cws is the cooling water outlet temperature; T_chws is the chilled water outlet temperature; T_chwr is the chilled water return temperature; and m_chw is the chilled water flow.

The power of the chilled water pump is related to factors such as the flow rate, the lift and the efficiency of the chilled water pump, and the lift and the efficiency are constant values, so that an energy consumption model of the chilled water pump can be established to be related to the flow rate. For example, the target control quantity includes a target chilled water pump flow rate, and specifically, a refrigeration model parameter of the chilled water pump energy consumption module may be obtained; and calculating a second predicted chilled water pump power of the cold source system in a target time period based on the target chilled water pump flow and the freezing model parameters.

For example, the specific expression of the energy consumption module of the chilled water pump may be as follows:

where f_i (i = 0, 1, 2) are the parameters to be determined of the chilled water pump energy consumption model. P″_chp is the second predicted chilled water pump power, i.e. the second predicted chilled water pump input power, and m_chp is the chilled water pump flow. Here m_chw and m_chp denote the same quantity: the chilled water flow of the chiller is the chilled water pump flow.

The power of the cooling water pump is related to factors such as the flow rate, the lift and the efficiency of the cooling water pump, and the lift and the efficiency are constant values, so that the energy consumption model of the cooling water pump can be established to be related to the flow rate. For example, the target control quantity includes a target cooling water pump flow, and specifically, a cooling model parameter of the cooling water pump energy consumption module may be obtained; and calculating second predicted cooling water pump power of the cold source system in a target time period based on the target cooling water pump flow and the cooling model parameters.

For example, the specific expression of the cooling water pump energy consumption module may be as follows:

where g_i (i = 0, 1, 2) are the parameters to be determined of the cooling water pump energy consumption model. P″_cp is the second predicted cooling water pump power, i.e. the second predicted cooling water pump input power, and m_cp is the cooling water pump flow.

Based on the mechanism model, historical data can be used to fit the model parameters to be determined.
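As an illustration of fitting mechanism-model parameters to historical data, the sketch below fits three pump parameters by least squares. It assumes, purely for illustration, that pump power is a quadratic polynomial in flow, P = f_0 + f_1·m + f_2·m²; the exact expression of the pump energy consumption module is not reproduced in this text, so the polynomial form, the function name `fit_quadratic` and the synthetic data are all assumptions.

```python
def fit_quadratic(flows, powers):
    """Least-squares fit of P = c0 + c1*m + c2*m**2 via the normal equations.

    ASSUMPTION: the quadratic form is illustrative, not the patent's stated model.
    """
    # Build the 3x3 system (X^T X) c = X^T y for the basis [1, m, m^2].
    rows = [[1.0, m, m * m] for m in flows]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * p for r, p in zip(rows, powers)) for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, 3):
            factor = A[row][col] / A[col][col]
            for c in range(col, 3):
                A[row][c] -= factor * A[col][c]
            b[row] -= factor * b[col]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = b[i] - sum(A[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = s / A[i][i]
    return coef


# Synthetic "history" generated from P = 2 + 0.5*m + 0.1*m^2 (made-up values).
flows = [1.0, 2.0, 3.0, 4.0, 5.0]
powers = [2 + 0.5 * m + 0.1 * m * m for m in flows]
f0, f1, f2 = fit_quadratic(flows, powers)
```

On noise-free synthetic data the fit recovers the generating coefficients; with real monitoring data the residual of the fit would serve as the prediction error of the mechanism sub-model.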

Because the devices of the data center cold source system are interrelated and the state quantities and control quantities are coupled, the actual energy consumption depends strongly on the operating conditions. Generally speaking, the data-driven model better captures the relationship between inputs and outputs and predicts small-range fluctuations accurately; the mechanism model better captures the relationship between energy consumption and the related variables, predicts large trends accurately, and is highly interpretable. Therefore, to improve the precision of the simulation model, a bagging operation is applied to the final energy consumption power outputs: the outputs of the data-driven sub-model and the mechanism energy consumption sub-model are weighted and combined to obtain the final predicted chiller input power, chilled water pump input power and cooling water pump input power for the next time period, that is, the fused predicted chiller power, the fused predicted chilled water pump power and the fused predicted cooling water pump power. For example, a first weight of the first predicted state quantity and a second weight of the second predicted state quantity may each be determined based on the prediction errors of the data-driven sub-model and the mechanism energy consumption sub-model; the first and second predicted state quantities are then fused based on the first weight and the second weight to obtain the fused predicted state quantity. For example, the specific fusion mode may be as follows:

P_ch＝θ₁P″_ch+(1-θ₁)P′_ch

P_chp＝θ₂P″_chp+(1-θ₂)P′_chp

P_cp＝θ₃P″_cp+(1-θ₃)P′_cp

where P′_ch is the first predicted chiller power, P′_chp is the first predicted chilled water pump power, and P′_cp is the first predicted cooling water pump power; P″_ch is the second predicted chiller power, P″_chp is the second predicted chilled water pump power, and P″_cp is the second predicted cooling water pump power.

The determination of the weight factor θ may depend on the prediction errors of the two models, and the model with the smaller prediction error is given a higher weight, which is specifically determined as follows:

where loss_ch′ and loss_ch″ denote the prediction errors of the chiller power in the data-driven sub-model and the mechanism energy consumption sub-model respectively, loss_chp′ and loss_chp″ denote the prediction errors of the chilled water pump power in the two sub-models respectively, and loss_cp′ and loss_cp″ denote the prediction errors of the cooling water pump power in the two sub-models respectively.
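The weighted fusion P = θP″ + (1 − θ)P′ can be sketched as follows. The weighting rule used here, θ = loss′ / (loss′ + loss″), is an assumption chosen only because it gives the sub-model with the smaller error the larger weight; the exact expression for θ is not reproduced in this text, and the function names and numbers are hypothetical.

```python
def fusion_weight(loss_data, loss_mech):
    """Weight theta placed on the mechanism sub-model output P''.

    ASSUMPTION: inverse-error normalisation, so the sub-model with the
    smaller prediction error receives the larger weight.
    """
    total = loss_data + loss_mech
    if total == 0:
        return 0.5  # both sub-models are exact; split evenly
    return loss_data / total


def fuse(p_data, p_mech, theta):
    """Fused power P = theta * P'' + (1 - theta) * P'."""
    return theta * p_mech + (1.0 - theta) * p_data


# Made-up errors: the mechanism sub-model is three times more accurate.
theta_ch = fusion_weight(loss_data=3.0, loss_mech=1.0)
p_ch = fuse(p_data=100.0, p_mech=120.0, theta=theta_ch)
```

The same two functions would be applied separately to the chiller, chilled water pump and cooling water pump powers, giving θ₁, θ₂ and θ₃.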

The chiller power mentioned in the embodiments may refer to chiller input power, the chilled water pump power may refer to chilled water pump input power, and the cooling water pump power may refer to cooling water pump input power. For example, the first predicted chiller power may refer to a first predicted chiller input power, the first predicted chilled water pump power may refer to a first predicted chilled water pump input power, and so on.

Bagging (bootstrap aggregating) is an ensemble learning algorithm in the field of machine learning. It is a technique for reducing generalization error by combining several models. The main idea is to train several different models separately and then let all models vote on the output for a test sample. This is an example of a general strategy in machine learning called model averaging. Techniques that employ this strategy are referred to as ensemble methods.

103. And performing income calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain an income value of the cold source system in a target time period.

For example, reward calculation may be specifically performed on the fused predicted state quantity according to a preset reward function, so as to obtain a reward value of the cold source system in a target time period; based on a preset constraint condition, carrying out punishment calculation on the fused predicted state quantity to obtain a punishment value of the cold source system in a target time interval; and calculating the benefit value of the cold source system in a target time period based on the reward value and the penalty value.

The preset constraint condition may include a preset cooling water return temperature constraint, a preset cooling capacity constraint, a preset operation condition constraint, and the like.

For example, based on the constructed simulation model, and considering the operating constraints and the energy consumption optimization target of the system, a reward function is designed, and the control strategy of the cold source system can then be optimized with a reinforcement learning method and verified accordingly. The basic mathematical model of reinforcement learning is the Markov Decision Process (MDP), generally described by a five-tuple M = <S, A, P, R, γ>, where S denotes the state space of the system (i.e., the space formed by the state quantities of the system), A denotes the action space (i.e., the space formed by all possible values of the control quantities), P denotes the system state transition probability (here represented by the function f(·)), R denotes the reward function, and γ denotes the discount factor. The Markov decision process modeling of the energy-saving optimization problem of the data center cold source system is described below. The state space, action space and system state transition probability have been described in step 102; the system constraints and the reward function are introduced here.

(1) Cooling water return temperature constraint:

In actual system operation, the cooling water return temperature T_cwr should not be lower than the wet bulb temperature T_sq, namely:

T_cwr≥T_sq

When the cooling water return temperature given by the control strategy is lower than the wet bulb temperature, it is forcibly updated to the wet bulb temperature:

T_cwr＝max(T_sq，T_cwr)

(2) Refrigerating capacity constraint:

The refrigerating capacity of the cold source system must be sufficient to keep the temperature inside the data center machine room within a reasonable range and avoid overheating. Generally, the higher the load of the IT equipment (for example, when the cold source system serves a data center, the IT equipment is the data center equipment), the more heat is produced and the more cooling capacity is needed. The cooling capacity of the cold source system is given by:

C＝c(T_chwr-T_chws)m_ch

where c is the specific heat capacity of water, T_chwr and T_chws are the chilled water return temperature and outlet temperature respectively, and m_ch is the chilled water flow. The refrigerating capacity should satisfy the following constraint:

C≥δL_IT

where δ is the refrigerating capacity coefficient and L_IT is the IT equipment load.
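The capacity formula C = c(T_chwr − T_chws)m_ch and the check C ≥ δL_IT can be worked through in a short sketch. The units are assumptions made for the example (c in kJ/(kg·K), flow in kg/s, so C and L_IT are in kW), as are the function names and all numeric values.

```python
C_WATER = 4.186  # specific heat capacity of water, kJ/(kg*K) (assumed units)


def cooling_capacity(t_chwr, t_chws, m_ch, c=C_WATER):
    """C = c * (T_chwr - T_chws) * m_ch; kW if m_ch is in kg/s."""
    return c * (t_chwr - t_chws) * m_ch


def capacity_ok(capacity, it_load, delta):
    """True when the constraint C >= delta * L_IT holds."""
    return capacity >= delta * it_load


# Made-up operating point: 17 degC return, 12 degC supply, 100 kg/s chilled water.
C = cooling_capacity(t_chwr=17.0, t_chws=12.0, m_ch=100.0)  # about 2093 kW
enough_for_1500 = capacity_ok(C, it_load=1500.0, delta=1.2)  # needs 1800 kW
enough_for_2000 = capacity_ok(C, it_load=2000.0, delta=1.2)  # needs 2400 kW
```

In this example the same operating point satisfies the constraint for the lighter IT load but violates it for the heavier one.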

Since the refrigerating capacity is related to a plurality of control quantities, it is difficult to directly make rigid constraints on each control quantity. For the cooling capacity constraint, a form of soft constraint is therefore used to be embodied in the reward function, and penalties are given when the cooling capacity does not satisfy the inequality, as will be seen in detail below.

(3) Operating condition constraint:

The operation of the data center cold source system must put safety first, and overly aggressive control should be avoided. To this end, the values of all control quantities can be restricted to the range that has appeared historically, ensuring that all controls vary within a safe range:

a^l≤a_t≤a^u

In the above formula, a^l and a^u are respectively the historical lower limit and upper limit of the control quantity a.

In addition, when strategy optimization is performed by using a reinforcement learning algorithm, control quantities are searched, and a given combination of control quantities may not be historically present, so that the system state may exceed a historical threshold. Due to the limited generalization capability of the simulation model, when the system state exceeds the historical threshold, the predicted state value may not be accurate. Therefore, the state of the system is constrained to a certain range of historical thresholds, namely:

τs^l≤s_t≤τs^u

In the above formula, s^l and s^u are respectively the historical lower limit and upper limit of the state quantity s, and τ is a threshold coefficient. When the system state exceeds the threshold, training is terminated and a large penalty is given.
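The hard constraints in (1) and (3) amount to simple clamping and range checks, sketched below. The function names and sample values are hypothetical, and the state check implements the inequality τs^l ≤ s_t ≤ τs^u literally as written above, with a single coefficient τ applied to both bounds.

```python
def enforce_wet_bulb(t_cwr, t_sq):
    """T_cwr = max(T_sq, T_cwr): never let the return temperature drop below wet bulb."""
    return max(t_sq, t_cwr)


def clip_controls(a, a_low, a_high):
    """Clamp each control quantity into its historical range [a^l, a^u]."""
    return [min(max(x, lo), hi) for x, lo, hi in zip(a, a_low, a_high)]


def state_in_range(s, s_low, s_high, tau):
    """Check tau * s^l <= s_t <= tau * s^u element-wise."""
    return all(tau * lo <= x <= tau * hi for x, lo, hi in zip(s, s_low, s_high))


# Made-up values for illustration.
t_cwr = enforce_wet_bulb(t_cwr=24.0, t_sq=26.0)
controls = clip_controls([5.0, 700.0], a_low=[7.0, 300.0], a_high=[12.0, 600.0])
inside = state_in_range([10.0, 5.0], s_low=[8.0, 4.0], s_high=[12.0, 6.0], tau=1.0)
outside = state_in_range([13.0, 5.0], s_low=[8.0, 4.0], s_high=[12.0, 6.0], tau=1.0)
```

A training loop would end the current run and apply the large penalty whenever the state check returns False.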

Then, calculating the reward value of the cold source system in a target time period, for example, the fused predicted state quantity comprises the fused predicted water chilling unit power, the fused predicted chilled water pump power and the fused predicted cooling water pump power, and determining the reward weight of the fused predicted state quantity; and calculating the reward value of the cold source system in the target time period based on the fused predicted water chilling unit power, the fused predicted chilled water pump power, the fused predicted cooling water pump power and the reward weight.

For example, to achieve energy saving, the objective is to reduce the energy consumption of the cold source system, and the reward function may be designed as follows:

where the three power terms are, respectively, the fused predicted chiller power, the fused predicted chilled water pump power and the fused predicted cooling water pump power in period t, R is a suitable positive constant, and α is a suitable positive weight.

Then, a penalty value of the cold source system in the target period can be calculated, for example, the equipment load of the data center, the refrigerating capacity coefficient of the cold source system and the refrigerating capacity of the cold source system in the target period can be obtained specifically; and calculating the punishment value of the cold source system in the target time period based on the equipment load, the refrigerating capacity coefficient and the refrigerating capacity of the data center.

For example, the cooling capacity constraint may be embodied as a penalty function, which is as follows:

r_t＝βmax(δL_IT-C_t，0)

where β is a suitable positive weight and C_t is the cooling capacity in period t. Combining the reward function and the penalty function gives a single-step benefit value of R_t - r_t.
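The penalty r_t = β·max(δL_IT − C_t, 0) and the single-step benefit R_t − r_t can be sketched directly from the formulas above. The reward R_t is taken as a given number here because the reward expression is not reproduced in this text; function names and values are illustrative assumptions.

```python
def capacity_penalty(it_load, capacity, delta, beta):
    """r_t = beta * max(delta * L_IT - C_t, 0): zero when capacity is sufficient."""
    return beta * max(delta * it_load - capacity, 0.0)


def step_benefit(reward, penalty):
    """Single-step benefit R_t - r_t."""
    return reward - penalty


# Made-up numbers: a 307 kW shortfall is penalised; a surplus is not.
r_short = capacity_penalty(it_load=2000.0, capacity=2093.0, delta=1.2, beta=0.01)
r_ok = capacity_penalty(it_load=1500.0, capacity=2093.0, delta=1.2, beta=0.01)
benefit = step_benefit(reward=10.0, penalty=r_short)
```

Because the penalty grows linearly with the shortfall, β sets how strongly a capacity violation outweighs the energy saving expressed by the reward.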

104. And determining a total benefit value of the cold source system in a preset time period based on the benefit value by adopting a preset control model, wherein the preset time period comprises at least one target time period.

The target time interval may be a next time interval of the current time interval, for example, the target control strategy of the preset control model may be specifically updated based on the profit value to obtain an updated control strategy; taking the target time interval as the current time interval, taking the updated control strategy as the target control strategy of the preset control model, and continuing to predict the next time interval of the cold source system until the income value of each time interval in the preset time interval is obtained; and determining the total benefit value of the cold source system in the preset time period based on the benefit value of each time period in the preset time period.

For example, the energy-saving optimization problem of the cold source system is converted into an optimization problem that maximizes the total profit under the condition that the preset constraint condition is satisfied, and the optimization problem may be specifically as follows:

where d is the control strategy and γ is the discount factor, which may take a value from 0 to 1 and characterizes the present weight of future benefits. The total benefit value may refer to the accumulated value of the single-step benefits.
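Accumulating the single-step benefits with the discount factor can be sketched as below. The sketch assumes the standard discounted sum, Σ γ^t (R_t − r_t), which is consistent with the description of γ above but is not confirmed by a formula in this text; the function name is hypothetical.

```python
def total_benefit(step_benefits, gamma):
    """Discounted sum of single-step benefits: sum over t of gamma**t * (R_t - r_t).

    ASSUMPTION: standard discounted-return form.
    """
    total = 0.0
    for t, g in enumerate(step_benefits):
        total += (gamma ** t) * g
    return total


# Three periods with a benefit of 1.0 each (made-up values).
discounted = total_benefit([1.0, 1.0, 1.0], gamma=0.5)    # 1 + 0.5 + 0.25
undiscounted = total_benefit([1.0, 1.0, 1.0], gamma=1.0)  # plain sum
```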

105. And when the total profit value does not meet the preset condition, adjusting the target control strategy according to the total profit to obtain an adjusted control strategy, and continuously predicting the multi-dimensional prediction state quantity of the cold source system in the target time interval by taking the adjusted control strategy as the target control strategy.

The preset condition may be set in various ways, for example, the preset condition may be flexibly set according to the requirements of the actual application, or may be preset and stored in the electronic device. In addition, the preset condition may be built in the electronic device, or may be stored in a memory and transmitted to the electronic device, or the like. For example, the preset condition may be a preset reinforcement learning round number M, that is, a preset learning round number, and the like.

For example, when the total benefit value does not satisfy the preset condition, that is, when reinforcement learning has not yet reached the preset number of learning rounds M, the target control strategy is adjusted according to the total benefit value to obtain an adjusted control strategy, and the adjusted control strategy is used as the target control strategy to continue predicting the predicted state quantities of the cold source system in multiple dimensions in the target time period, until reinforcement learning reaches the preset number of learning rounds.
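The outer loop of steps 104 to 106 (evaluate the total benefit, adjust the strategy, stop after M rounds) can be illustrated with a deliberately simplified sketch. Everything here is a stand-in: the "strategy" is a single setpoint, the simulator is a toy function whose benefit peaks at 12.0, and the update rule is plain hill climbing rather than any of the reinforcement learning algorithms named in this description.

```python
import random


def episode_benefit(setpoint, horizon=5, gamma=0.9):
    """Toy stand-in for the simulation model: benefit peaks at setpoint 12.0 (made-up)."""
    step = -(setpoint - 12.0) ** 2
    return sum((gamma ** t) * step for t in range(horizon))


def train(initial_setpoint, rounds_m, step_size=0.5, seed=0):
    """Run a fixed number of learning rounds M, keeping only improving adjustments."""
    rng = random.Random(seed)
    policy = initial_setpoint
    best = episode_benefit(policy)
    for _ in range(rounds_m):  # preset condition: stop after M rounds
        candidate = policy + rng.uniform(-step_size, step_size)
        value = episode_benefit(candidate)
        if value > best:  # adjust the strategy according to the total benefit
            policy, best = candidate, value
    return policy, best


policy, best = train(initial_setpoint=8.0, rounds_m=200)
```

In a real setting the candidate update would come from the chosen reinforcement learning algorithm, and `episode_benefit` would roll out the fused simulation model with the constraints and penalties applied.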

106. And when the total benefit value meets a preset condition, outputting a trained control model for controlling the cold source system.

For example, the trained control model may be obtained when the total benefit value satisfies the preset condition, that is, when reinforcement learning has reached the preset number of learning rounds M. Then, the current state quantity of the cold source system is acquired, wherein the current state quantity comprises the state quantities of the current time period and of a preset time period before the current time period; a system control strategy of the cold source system is determined by using the trained control model based on the current state quantity; and the cold source system is controlled according to the system control strategy. For example, the trained control model may be used to determine the system control strategy of the cold source system, the system control quantity currently adopted by the cold source system may be determined based on the current state quantity and the system control strategy, and the cold source system may be controlled based on the system control quantity.

It should be noted that, suitable reinforcement learning algorithms can be selected according to actual situations, including but not limited to Q-learning, DQN, Actor-Critic, DDPG, TRPO, PPO, SAC, and other reinforcement learning algorithms.

Through the verification of a numerical experiment, the method provided by the scheme can realize that the energy-saving proportion of a cold source system of the data center is about 1% -5%, and the adjustment direction of the control strategy summarized according to the learned strategy accords with the expert experience.

It should be noted that, in order to improve the security of the energy-saving control of the cold source system, the data in the above method is stored in a blockchain. A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, and an application service layer.

The blockchain underlying platform may include processing modules such as user management, basic services, smart contracts, and operation monitoring. The user management module is responsible for identity information management of all blockchain participants, including maintenance of public and private key generation (account management), key management, and maintenance of the correspondence between users' real identities and blockchain addresses (authority management); where authorized, it supervises and audits the transactions of certain real identities and provides rule configuration for risk control (risk-control audit). The basic service module is deployed on all blockchain node devices to verify the validity of service requests and to record valid requests to storage after consensus on them is completed. For a new service request, the basic service first performs interface adaptation, parsing, and authentication (interface adaptation), then encrypts the service information through a consensus algorithm (consensus management), transmits the encrypted information completely and consistently to the shared ledger (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance, contract triggering, and contract execution; developers can define contract logic in a programming language and publish it to the blockchain (contract registration), have it triggered by keys or other events and executed according to the logic of the contract terms, and the module also provides functions for upgrading and cancelling contracts. The operation monitoring module is mainly responsible for deployment during product release, configuration modification, contract settings, cloud adaptation, and visual output of real-time status during product operation, such as alarms, monitoring of network conditions, and monitoring of node device health.

The platform product service layer provides basic capability and an implementation framework of typical application, and developers can complete block chain implementation of business logic based on the basic capability and the characteristics of the superposed business. The application service layer provides the application service based on the block chain scheme for the business participants to use.

As can be seen from the above, the present embodiment may obtain the current state quantity of the cold source system and the target control strategy of the preset control model, where the current state quantity includes the state quantity of the current time period and the preset time period before the current time period; then, predicting the predicted state quantities of multiple dimensions of the cold source system in a target time period according to the current state quantity and a target control strategy, and fusing the predicted state quantities of the multiple dimensions to obtain a fused predicted state quantity; then, performing income calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain an income value of the cold source system in a target time period; determining a total profit value of the cold source system in a preset time period based on the profit value by adopting a preset control model, wherein the preset time period comprises at least one target time period; when the total profit value does not meet the preset condition, adjusting the target control strategy according to the total profit to obtain an adjusted control strategy, and continuously predicting the multi-dimensional prediction state quantity of the cold source system in the target time interval by taking the adjusted control strategy as the target control strategy; and when the total benefit value meets a preset condition, outputting a trained control model for controlling the cold source system. The scheme can consider the energy conservation of a cold source system of the data center, and provides an energy conservation optimization method of the cold source system of the data center based on reinforcement learning from the control of the temperature and the flow of chilled water and cooling water of a water chilling unit and a cooling tower. 
First, the sensor monitoring data accumulated in the data center is used to learn and construct a simulation model of the cold source system. Then, based on the simulation model and taking into account the operating constraints of the system, the energy consumption optimization target, and the like, a reward function is designed and the control strategy is optimized with a reinforcement learning algorithm, finally achieving energy saving of the cold source system. The scheme can model and optimize the data center cold source system and, following the fluctuation of the IT load in the data center machine room, obtain more energy-efficient control parameters for the cold source system under the constraint of adequate cooling capacity, thereby reducing energy consumption. In addition, the data-driven model and the mechanism-driven model are combined, which improves precision and interpretability. Because the cold source system of a data center is large in scale and complex in structure, constructing a system simulation model is difficult. The "data + mechanism" fusion model can, on the one hand, exploit the value of historical data and mine the correlations between variables with an artificial neural network; on the other hand, it can exploit the strong interpretability of the mechanism model, making the model more credible. Based on the trained control model and strategy, the corresponding control quantity can be output directly according to the working condition of the cold source system, achieving efficient control strategy optimization. As the system operates, its working-condition characteristics may change, and the model and strategy can be updated iteratively on an ongoing basis, so that real-time dynamic adjustment is possible and the system adapts well.

The method described in the previous embodiment is further detailed by way of example.

In this embodiment, the description takes as an example the case where the energy-saving control device of the cold source system is integrated in an electronic device, the cold source system is the cold source system of a data center, the preset time period is a period of T steps, the preset condition is a preset number of learning rounds M, the current state of the cold source system is s_t, and the target control strategy is d.

(I) First, a data-driven submodel may be established, specifically as follows:

In order to improve the efficiency of energy-saving control of the cold source system, the data-driven submodel can be trained first. The data-driven submodel may be trained from a plurality of historical data of the cold source system. It may be trained by another device and then provided to the energy-saving control device of the cold source system, or the energy-saving control device of the cold source system may train it itself. For example, the electronic device may train the data-driven submodel using artificial neural networks (recurrent neural networks and their variants), regression trees (XGBoost), and the like.

(II) Next, the model parameters of the mechanism energy consumption submodel are determined, specifically as follows:

In order to improve the efficiency of energy-saving control of the cold source system, the model parameters of the mechanism energy consumption submodel can be determined first. These model parameters may be determined from a plurality of historical data of the cold source system, such as its historical state quantities and historical control quantities. Specifically, the model parameters may be determined by another device and then provided to the energy-saving control device of the cold source system, or the energy-saving control device of the cold source system may determine them itself.

(III) Finally, energy-saving control of the cold source system can be carried out using the data-driven submodel and the determined model parameters of the mechanism energy consumption submodel, as shown in fig. 2a and fig. 2b.

As shown in fig. 2a, a specific process of an energy saving control method of a cooling source system may be as follows:

201. The electronic device initializes the reinforcement learning parameters of a preset control model.

For example, the electronic device may initialize the reinforcement learning parameters, including the neural network or equation used to estimate the state value function, the learning rate, the number of learning rounds M, and the number of steps per round T. It sets the round count m to 0, sets the step count t within the round to 0, and initializes the system state s_t.
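The initialization in step 201 can be sketched as follows. This is only an illustrative layout: the class names, field names, and default values are assumptions, the round counter is called m here to distinguish it from the total number of rounds M, and a plain dictionary stands in for whatever neural network or equation estimates the state value function.

```python
from dataclasses import dataclass, field


@dataclass
class RLConfig:
    """Hyperparameters named in step 201 (default values are illustrative assumptions)."""
    learning_rate: float = 0.1   # step size for value-function updates
    num_rounds: int = 100        # M, preset number of learning rounds
    steps_per_round: int = 24    # T, number of steps in one round


@dataclass
class RLState:
    """Mutable training state that step 201 resets."""
    round_count: int = 0                             # m, starts at 0
    step_count: int = 0                              # t, starts at 0
    value_table: dict = field(default_factory=dict)  # stand-in for the value estimator
    system_state: tuple = ()                         # s_t, taken from sensor readings


def initialize(initial_state):
    """Return the configuration and a fresh training state for the given initial s_t."""
    return RLConfig(), RLState(system_state=tuple(initial_state))


# Hypothetical initial state: chiller, chilled water pump, cooling water pump power in kW.
config, state = initialize([350.0, 40.0, 35.0])
```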

202. The electronic equipment acquires the current state quantity of the cold source system and a target control strategy of a preset control model.

The current state quantity comprises the state quantities of the current time period and of a preset time period before the current time period. For example, the electronic device may obtain the input power of the chiller, the input power of the chilled water pump, and the input power of the cooling water pump of the cold source system, and determine the control quantity to adopt based on the current state quantity of the cold source system and the target control strategy of the preset control model. That is, a control action a_t is selected according to the current state s_t and the strategy d; the current state s_t and the selected action a_t are input into the pre-constructed simulation model, namely the pre-established data-driven submodel, mechanism energy consumption submodel, and fusion submodel; the simulation model then outputs the updated state s', and s_t is set to s' (s_t = s'). For details, see steps 203 to 205 below.

203. The electronic device predicts a first predicted state quantity of the cold source system in a next period by using the data driven sub-model.

For example, the electronic device may use the data-driven submodel to predict a first predicted chiller power, a first predicted chilled water pump power, and a first predicted cooling water pump power of the cold source system in the next period; that is, the first predicted powers for the next period are P'_ch for the chiller, P'_chp for the chilled water pump, and P'_cp for the cooling water pump.

For example, let the system state in period t be S_t and the control quantity be a_t. Because a control quantity influences several subsequent periods, the system states and adopted control quantities of the current and preceding periods are taken as input, and the system state of the next period is output, namely:

S_{t+1} = f(S_{t-L+1}, a_{t-L+1}, ..., S_t, a_t)

where L is the window length considered (i.e., the preset time period), that is, the number of periods that may affect the next state quantity; its value may be determined according to the specific characteristics of the system.
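As a minimal sketch of how the input of f could be assembled, the helper below flattens the last L state and control vectors into one feature list. The function name and the list-of-lists data layout are assumptions made for illustration; the submodel f itself (for example a recurrent network or a regression tree) is not shown.

```python
def build_window_input(states, actions, t, L):
    """Flatten (S_{t-L+1}, a_{t-L+1}, ..., S_t, a_t) into one feature list.

    states[i] and actions[i] are lists of floats for period i.
    """
    if L < 1:
        raise ValueError("window length L must be at least 1")
    start = t - L + 1
    if start < 0 or t >= len(states) or t >= len(actions):
        raise ValueError("not enough history for the requested window")
    features = []
    for i in range(start, t + 1):
        features.extend(states[i])    # state quantities of period i
        features.extend(actions[i])   # control quantities of period i
    return features


# Hypothetical history: two state quantities and one control quantity per period.
states = [[300.0, 40.0], [310.0, 41.0], [305.0, 39.0]]
actions = [[7.0], [7.5], [8.0]]
x = build_window_input(states, actions, t=2, L=2)
```

With L = 2 the feature list covers periods 1 and 2 only; a larger L lets the model see slower effects at the cost of a wider input.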

204. And the electronic equipment predicts a second predicted state quantity of the cold source system in the next period by utilizing the mechanism energy consumption submodel.

For example, the mechanism energy consumption submodel includes a chiller energy consumption module, a chilled water pump energy consumption module and a cooling water pump energy consumption module, and the electronic device may specifically predict a second predicted chiller power of the cold source system in a next period by using the chiller energy consumption module; predicting a second predicted chilled water pump power of the cold source system in the next time period by using a chilled water pump energy consumption module; and predicting a second predicted cooling water pump power of the cold source system in the next time period by using a cooling water pump energy consumption module.
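This passage does not give the equations used by the three energy consumption modules. Purely as an illustration of what a mechanism-based prediction can look like, the sketch below uses two textbook relations as stand-ins: chiller power as cooling load divided by the coefficient of performance (COP), and pump power following the cube law of the pump affinity relations. All function names, parameters, and numbers are assumptions, not the patent's model.

```python
def chiller_power(cooling_load_kw, cop):
    """Electrical power of the chiller from cooling load and COP (textbook relation)."""
    if cop <= 0:
        raise ValueError("COP must be positive")
    return cooling_load_kw / cop


def pump_power(flow, rated_flow, rated_power_kw):
    """Pump power scaled by the cube of the flow ratio (idealized affinity law)."""
    if rated_flow <= 0:
        raise ValueError("rated flow must be positive")
    return rated_power_kw * (flow / rated_flow) ** 3


def mechanism_predict(cooling_load_kw, cop, chw_flow, cw_flow, rated):
    """Second predicted powers (kW) for chiller, chilled water pump, cooling water pump."""
    return {
        "chiller": chiller_power(cooling_load_kw, cop),
        "chilled_water_pump": pump_power(chw_flow, rated["chw_flow"], rated["chw_power_kw"]),
        "cooling_water_pump": pump_power(cw_flow, rated["cw_flow"], rated["cw_power_kw"]),
    }


# Hypothetical rated values; in the scheme such parameters would be fitted from history.
rated = {"chw_flow": 100.0, "chw_power_kw": 40.0, "cw_flow": 120.0, "cw_power_kw": 48.0}
second = mechanism_predict(1000.0, 5.0, chw_flow=50.0, cw_flow=60.0, rated=rated)
```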

205. And the electronic equipment fuses the first prediction state quantity and the second prediction state quantity to obtain the fused prediction state quantity.

For example, the electronic device may determine, based on the prediction errors of the data-driven submodel and the mechanism energy consumption submodel, the weights of the first predicted chiller power, the first predicted chilled water pump power, and the first predicted cooling water pump power, and the weights of the second predicted chiller power, the second predicted chilled water pump power, and the second predicted cooling water pump power. The first and second predicted chiller powers are then fused according to their weights to obtain the fused predicted chiller power; the first and second predicted chilled water pump powers are fused according to their weights to obtain the fused predicted chilled water pump power; and the first and second predicted cooling water pump powers are fused according to their weights to obtain the fused predicted cooling water pump power. To improve the accuracy of the fusion, the weights of the first and second predicted chiller powers may sum to 1, the weights of the first and second predicted chilled water pump powers may sum to 1, and the weights of the first and second predicted cooling water pump powers may sum to 1.
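The text states that the weights depend on the prediction errors and that each pair sums to 1, but it does not state the exact rule. One simple rule that satisfies both properties is inverse-error weighting, sketched below as an assumption; the function names and the example numbers are hypothetical.

```python
def inverse_error_weights(err_data, err_mech):
    """Return (w_data, w_mech) summing to 1; the submodel with smaller error gets more weight."""
    if err_data < 0 or err_mech < 0:
        raise ValueError("prediction errors must be non-negative")
    total = err_data + err_mech
    if total == 0:
        return 0.5, 0.5            # both submodels are exact: split evenly
    return err_mech / total, err_data / total


def fuse(first, second, errors_data, errors_mech):
    """Fuse first (data-driven) and second (mechanism) predictions per component."""
    fused = {}
    for key in first:
        w_data, w_mech = inverse_error_weights(errors_data[key], errors_mech[key])
        fused[key] = w_data * first[key] + w_mech * second[key]
    return fused


# Hypothetical predictions (kW) and historical prediction errors (kW).
first = {"chiller": 200.0, "chilled_water_pump": 30.0, "cooling_water_pump": 20.0}
second = {"chiller": 220.0, "chilled_water_pump": 34.0, "cooling_water_pump": 20.0}
errors_data = {"chiller": 10.0, "chilled_water_pump": 1.0, "cooling_water_pump": 2.0}
errors_mech = {"chiller": 30.0, "chilled_water_pump": 3.0, "cooling_water_pump": 2.0}
fused = fuse(first, second, errors_data, errors_mech)
```

In this example the data-driven submodel has one third of the mechanism submodel's error for the chiller, so it receives a weight of 0.75 and the fused chiller power lies closer to its prediction.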

206. And the electronic equipment carries out reward calculation on the fused predicted state quantity according to a preset reward function to obtain a reward value of the cold source system in a target time period.

For example, the electronic device may specifically determine a reward weight of the post-fusion predicted state quantity; and calculating the reward value of the cold source system in the target time period based on the fused predicted water chilling unit power, the fused predicted chilled water pump power, the fused predicted cooling water pump power and the reward weight.
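A minimal sketch of such a reward calculation follows. The text only says that the reward is computed from the three fused powers and the reward weights; taking the negative weighted sum, so that lower energy use yields a higher reward, is an assumption, as are the names and numbers.

```python
def reward(fused, weights):
    """Negative weighted sum of the fused predicted powers (assumed form of the reward)."""
    return -sum(weights[name] * fused[name] for name in weights)


# Hypothetical fused predictions (kW) and equal reward weights.
fused = {"chiller": 205.0, "chilled_water_pump": 31.0, "cooling_water_pump": 20.0}
weights = {"chiller": 1.0, "chilled_water_pump": 1.0, "cooling_water_pump": 1.0}
r = reward(fused, weights)
```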

207. The electronic device performs a penalty calculation on the fused predicted state quantity based on a preset constraint condition to obtain a penalty value of the cold source system in the target time period.

For example, the electronic device may obtain the equipment load of the data center, the cooling capacity coefficient of the cold source system, and the cooling capacity of the cold source system in the target time period, and calculate the penalty value of the cold source system in the target time period based on the equipment load, the cooling capacity coefficient, and the cooling capacity.
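The functional form of the penalty is not given here. The sketch below assumes one plausible reading: the coefficient scales the equipment load into a required cooling capacity, and the penalty grows linearly with any shortfall of the supplied cooling capacity. The function name, the `scale` parameter, and this reading of the coefficient are all assumptions.

```python
def penalty(it_load_kw, capacity_coefficient, cooling_capacity_kw, scale=1.0):
    """Penalty proportional to the cooling shortfall (assumed hinge form)."""
    required_kw = capacity_coefficient * it_load_kw      # cooling the load is assumed to need
    shortfall_kw = max(0.0, required_kw - cooling_capacity_kw)
    return scale * shortfall_kw


# Hypothetical values: 1000 kW equipment load and a coefficient of 1.2.
p_short = penalty(1000.0, 1.2, 1100.0)   # supplied capacity is below the requirement
p_ok = penalty(1000.0, 1.2, 1300.0)      # requirement satisfied, no penalty
```

A hinge of this kind leaves the reward untouched while the constraint holds and only pushes the strategy away from operating points that under-cool the machine room.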

208. And the electronic equipment calculates the income value of the cold source system in a target time period based on the reward value and the penalty value, and determines the total income value of the cold source system in a preset time period.

For example, the electronic device may combine the reward function and the penalty function to obtain the profit value R_t of the next period. The target control strategy of the preset control model is then updated based on the profit value to obtain an updated control strategy; the next period is taken as the current period (i.e., t = t + 1), and the updated control strategy is taken as the target control strategy of the preset control model to continue predicting the next period of the cold source system (i.e., step 202 is executed again) until the profit value of each period within the preset time period has been obtained (i.e., until t = T). The total profit value of the cold source system for the preset time period is determined based on the profit value of each period within the preset time period, and step 209 is then performed.
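The loop over the T periods can be sketched as follows. For brevity this sketch keeps the strategy fixed during the round and omits the per-period strategy update described above; it also takes the total as a plain sum of per-period profit values, which is an assumption since no discounting rule is stated. The callables `policy`, `simulate`, and `profit` are hypothetical stand-ins for the target control strategy, the simulation model, and the combined reward and penalty.

```python
def rollout_total_profit(initial_state, policy, simulate, profit, T):
    """Simulate T periods and return (total profit, list of per-period profits)."""
    state = initial_state
    per_period = []
    for _ in range(T):
        action = policy(state)              # choose a_t from s_t under the strategy
        state = simulate(state, action)     # simulation model returns the next state
        per_period.append(profit(state))    # profit value of that period
    return sum(per_period), per_period


# Toy stand-ins: the state is a power in kW and each action lowers it by 1 kW.
total, per_period = rollout_total_profit(
    initial_state=10.0,
    policy=lambda s: 1.0,
    simulate=lambda s, a: s - a,
    profit=lambda s: -s,
    T=3,
)
```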

209. When reinforcement learning has not reached the preset number of learning rounds, the electronic device adjusts the target control strategy and returns to step 202 with the adjusted control strategy as the target control strategy.

For example, when the reinforcement learning does not reach the preset number of learning rounds, the electronic device adjusts the target control strategy according to the total income to obtain an adjusted control strategy, and continues to predict the predicted state quantities of the cold source system in multiple dimensions in the next period by using the adjusted control strategy as the target control strategy.

For example, when reinforcement learning has not reached the preset number of learning rounds, the electronic device may adjust the target control strategy according to the total profit (i.e., adjust the parameters of the neural network or equation used to estimate the state value function) to obtain an adjusted control strategy, and continue reinforcement learning with the adjusted control strategy as the target control strategy (i.e., return to step 202 and set m = m + 1) until the preset number of learning rounds is reached (i.e., until m = M), and then perform step 210.
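The nesting of the round counter (written m below, running up to the preset number M) and the step counter t can be sketched with a deliberately small example. It is not one of the algorithms listed earlier: it uses a single-state value table, a myopic update with no discounting, and a deterministic exploration schedule so that the run is reproducible. The environment, the action names, and all numbers are hypothetical.

```python
def train(actions, simulate, profit, initial_state, M, T, lr=0.1):
    """Run M learning rounds of T steps each; return action values and per-round totals."""
    q = {a: 0.0 for a in actions}   # value estimate per action (single-state toy)
    totals = []                     # total profit of each round
    n = 0                           # global step counter, drives exploration
    m = 0                           # round counter
    while m < M:                    # repeat until m reaches M
        state = initial_state
        total_profit = 0.0
        for t in range(T):          # one pass over the T periods of the round
            if n % 5 == 0:          # every fifth step: cycle through the actions
                action = actions[(n // 5) % len(actions)]
            else:                   # otherwise act greedily on current estimates
                action = max(actions, key=lambda a: q[a])
            state = simulate(state, action)
            r = profit(state)
            q[action] += lr * (r - q[action])   # myopic value update
            total_profit += r
            n += 1
        totals.append(total_profit)
        m += 1
    return q, totals


# Toy environment: the next state is the total power (kW) implied by the chosen setpoint.
POWER_KW = {"low_setpoint": 10.0, "high_setpoint": 5.0}
actions = ["low_setpoint", "high_setpoint"]
q, totals = train(actions, lambda s, a: POWER_KW[a], lambda s: -s, 0.0, M=50, T=5)
best_action = max(actions, key=lambda a: q[a])
```

After training, the action with the lower simulated power has the higher value estimate, which is the behavior the trained control model is meant to capture at a much larger scale.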

210. And when the reinforcement learning reaches the preset learning turn number, the electronic equipment outputs a post-training control model and controls the cold source system based on the post-training control model.

For example, the electronic device may obtain the trained control model when reinforcement learning reaches the preset number of learning rounds (i.e., when m = M). The current state quantity of the cold source system is then acquired, where the current state quantity comprises the state quantities of the current time period and of a preset time period before the current time period; a system control strategy of the cold source system is determined from the current state quantity by using the trained control model; and the cold source system is controlled according to the system control strategy. For example, the trained control model may be used to determine the system control strategy of the cold source system, the system control quantity to be adopted by the cold source system is determined based on the current state quantity and the system control strategy, and the cold source system is controlled based on that system control quantity. The specific process flow may be as shown in fig. 2b.

Through the verification of a numerical experiment, the energy saving proportion of a cold source system of the data center is about 1% -5%, and the direction of the control strategy adjustment summarized according to the learned strategy accords with the expert experience.

In order to better implement the method, correspondingly, an embodiment of the present application further provides an energy saving control device for a cold source system, where the energy saving control device for the cold source system may be specifically integrated in an electronic device, and the electronic device may be a server or a terminal.

For example, as shown in fig. 3, the energy saving control apparatus of the cold source system may include an obtaining unit 301, a predicting unit 302, a calculating unit 303, a determining unit 304, an adjusting unit 305, and a controlling unit 306, as follows:

an obtaining unit 301, configured to obtain a current state quantity of a cold source system and a target control policy of a preset control model, where the current state quantity includes a current time period and a state quantity of a preset time period before the current time period;

the prediction unit 302 is configured to predict predicted state quantities of multiple dimensions of the cold source system in a target time period according to the current state quantity and a target control strategy, and fuse the predicted state quantities of the multiple dimensions to obtain a fused predicted state quantity;

the calculating unit 303 is configured to perform benefit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition, so as to obtain a benefit value of the cold source system in a target time period;

a determining unit 304, configured to determine, by using a preset control model, a total benefit value of the cold source system in a preset time period based on the benefit value, where the preset time period includes at least one target time period;

the adjusting unit 305 is configured to, when the total benefit value does not satisfy a preset condition, adjust the target control strategy according to the total benefit to obtain an adjusted control strategy, and continue predicting the predicted state quantities of the cold source system in multiple dimensions in a target time period by using the adjusted control strategy as the target control strategy;

the control unit 306 is configured to output a post-training control model for controlling the cold source system when the total benefit value meets a preset condition.

Optionally, in some embodiments, the prediction unit 302 includes a determination subunit, a first prediction subunit, a second prediction subunit, and a fusion subunit.

Optionally, in some embodiments, the calculating unit 303 includes a first calculating subunit, a second calculating subunit, and a third calculating subunit.

Optionally, in some embodiments, the target time interval is a time interval next to a current time interval, and the determining unit 304 may be specifically configured to update the target control strategy of the preset control model based on the profit value to obtain an updated control strategy; taking the target time interval as the current time interval, taking the updated control strategy as the target control strategy of the preset control model, and continuing to predict the next time interval of the cold source system until the income value of each time interval in the preset time interval is obtained; and determining the total benefit value of the cold source system in the preset time period based on the benefit value of each time period in the preset time period.

Optionally, in some embodiments, the control unit 306 may be specifically configured to obtain a current state quantity of the cold source system, where the current state quantity includes a current time period and a state quantity of a preset time period before the current time period; determining a system control strategy of the cold source system by utilizing the trained control model based on the current state quantity; and controlling the cold source system according to the system control strategy.

In a specific implementation, the above units may be implemented as independent entities, or may be combined arbitrarily to be implemented as the same or several entities, and the specific implementation of the above units may refer to the foregoing method embodiments, which are not described herein again.

As can be seen from the above, in the present embodiment, the obtaining unit 301 obtains the current state quantity of the cold source system and the target control strategy of the preset control model, where the current state quantity includes the current time period and the state quantity of the preset time period before the current time period; then, the prediction unit 302 predicts the predicted state quantities of multiple dimensions of the cold source system in the target time period according to the current state quantity and the target control strategy, and fuses the predicted state quantities of multiple dimensions to obtain the fused predicted state quantity; then, the calculation unit 303 performs revenue calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a revenue value of the cold source system in a target time period; determining, by the determining unit 304, a total benefit value of the cold source system in a preset time period based on the benefit value by using a preset control model, where the preset time period includes at least one target time period; when the total profit value does not satisfy the preset condition, the adjusting unit 305 adjusts the target control strategy according to the total profit to obtain an adjusted control strategy, and the adjusted control strategy is used as the target control strategy to continuously predict the predicted state quantities of the cold source system in multiple dimensions in the target time interval; when the total benefit value satisfies a preset condition, the control unit 306 outputs a trained control model for controlling the cold source system. 
According to the scheme, the sensor monitoring data accumulated by the data center can be utilized to learn and construct a simulation model of the cold source system, then based on the simulation model, the operation constraint, the energy consumption optimization target and the like of the system are considered, the reward function design is carried out, the reinforcement learning algorithm is utilized to optimize the control strategy, and finally the energy conservation of the cold source system is realized. According to the scheme, modeling and optimization can be performed on the data center cold source system, and more energy-saving cold source system control parameters can be obtained under the constraint of proper refrigerating capacity according to the fluctuation of equipment load in a data center machine room, so that the reduction of energy consumption is realized. And the data driving model and the mechanism driving model are combined, so that the precision and the interpretability are improved. Based on the trained control model and strategy, the corresponding control quantity can be directly output according to the working condition of the cold source system, and efficient control strategy optimization is realized. With the operation of the system, the working condition characteristics of the system can be changed, and the model and the strategy can be continuously updated in an iterative manner, so that the real-time dynamic adjustment can be realized, and the system has good adaptability.

In addition, an electronic device according to an embodiment of the present application is further provided, as shown in fig. 4, which shows a schematic structural diagram of the electronic device according to an embodiment of the present application, and specifically:

the electronic device may include components such as a processor 401 of one or more processing cores, memory 402 of one or more computer-readable storage media, a power supply 403, and an input unit 404. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 4 does not constitute a limitation of the electronic device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components. Wherein:

the processor 401 is a control center of the electronic device, connects various parts of the whole electronic device by various interfaces and lines, performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 402 and calling data stored in the memory 402, thereby performing overall monitoring of the electronic device. Optionally, processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 401.

The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to use of the electronic device, and the like. Further, the memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.

The electronic device further comprises a power supply 403 for supplying power to the various components, and preferably, the power supply 403 is logically connected to the processor 401 through a power management system, so that functions of managing charging, discharging, and power consumption are realized through the power management system. The power supply 403 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.

The electronic device may further include an input unit 404, and the input unit 404 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.

Although not shown, the electronic device may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 401 in the electronic device loads the executable file corresponding to the process of one or more application programs into the memory 402 according to the following instructions, and the processor 401 runs the application program stored in the memory 402, thereby implementing various functions as follows:

acquiring a current state quantity of a cold source system and a target control strategy of a preset control model, wherein the current state quantity comprises the state quantities of the current time period and of a preset time period before the current time period; predicting predicted state quantities of multiple dimensions of the cold source system in a target time period according to the current state quantity and the target control strategy, and fusing the predicted state quantities of the multiple dimensions to obtain a fused predicted state quantity; performing benefit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a benefit value of the cold source system in the target time period; determining, by using the preset control model, a total benefit value of the cold source system in a preset time period based on the benefit value, wherein the preset time period comprises at least one target time period; when the total benefit value does not meet a preset condition, adjusting the target control strategy according to the total benefit value to obtain an adjusted control strategy, and continuing to predict the predicted state quantities of multiple dimensions of the cold source system in the target time period with the adjusted control strategy as the target control strategy; and when the total benefit value meets the preset condition, outputting a trained control model for controlling the cold source system.

The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.

As can be seen from the above, the present embodiment may acquire the current state quantity of the cold source system and the target control strategy of the preset control model, where the current state quantity comprises the state quantities of the current time period and of a preset time period before the current time period; predict predicted state quantities of multiple dimensions of the cold source system in a target time period according to the current state quantity and the target control strategy, and fuse the predicted state quantities of the multiple dimensions to obtain a fused predicted state quantity; perform benefit calculation on the fused predicted state quantity according to a preset reward function and a preset constraint condition to obtain a benefit value of the cold source system in the target time period; determine, by using the preset control model, a total benefit value of the cold source system in a preset time period based on the benefit value, where the preset time period comprises at least one target time period; when the total benefit value does not meet a preset condition, adjust the target control strategy according to the total benefit value to obtain an adjusted control strategy, and continue to predict the predicted state quantities of multiple dimensions of the cold source system in the target time period with the adjusted control strategy as the target control strategy; and when the total benefit value meets the preset condition, output a trained control model for controlling the cold source system.
This scheme can use the sensor monitoring data accumulated by the data center to learn and construct a simulation model of the cold source system; then, based on the simulation model, design a reward function that takes the operation constraints and the energy consumption optimization target of the system into account, and optimize the control strategy with a reinforcement learning algorithm, finally realizing energy saving of the cold source system. The scheme can model and optimize the data center cold source system and, as the equipment load in the data center machine room fluctuates, obtain more energy-saving control parameters for the cold source system under the constraint of adequate refrigerating capacity, thereby reducing energy consumption. Combining the data-driven model with the mechanism-driven model improves precision and interpretability. Based on the trained control model and strategy, the corresponding control quantity can be output directly according to the working condition of the cold source system, realizing efficient control strategy optimization. As the system operates, its working condition characteristics may change, and the model and the strategy can be updated iteratively, so that real-time dynamic adjustment is achieved and the system adapts well.
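The predict, evaluate, and adjust cycle summarized above can be illustrated with a minimal sketch. It is not the method of the application: the reinforcement learning algorithm is replaced by a plain hill climb over a single hypothetical control quantity (a chilled water outlet setpoint), the "preset condition" is assumed to be "no neighbouring strategy improves the total benefit value", and every function name, formula, and number is an illustrative assumption.

```python
def predict_power(load_kw, setpoint_c):
    """Toy stand-in for the fused prediction: total power (kW) for one period."""
    # Assumed shape: a higher chilled-water setpoint lowers power use.
    return load_kw * (0.30 - 0.01 * (setpoint_c - 7.0))


def cooling_capacity(setpoint_c):
    """Toy stand-in: available refrigerating capacity (kW) falls as the setpoint rises."""
    return 1500.0 - 60.0 * (setpoint_c - 7.0)


def benefit(load_kw, setpoint_c, penalty_weight=10.0):
    """Benefit of one target period: negative power minus a constraint penalty."""
    shortfall = max(0.0, load_kw - cooling_capacity(setpoint_c))
    return -predict_power(load_kw, setpoint_c) - penalty_weight * shortfall


def total_benefit(loads_kw, setpoint_c):
    """Total benefit over the preset time period: sum over its target periods."""
    return sum(benefit(load, setpoint_c) for load in loads_kw)


def train(loads_kw, setpoint_c=7.0, step=0.5, max_iters=100):
    """Adjust the strategy until no neighbouring setpoint improves the total benefit."""
    best = total_benefit(loads_kw, setpoint_c)
    for _ in range(max_iters):
        # Evaluate the two neighbouring strategies and keep the better one.
        value, candidate = max(
            (total_benefit(loads_kw, c), c)
            for c in (setpoint_c - step, setpoint_c + step)
        )
        if value <= best:  # assumed preset condition: no further improvement
            break
        setpoint_c, best = candidate, value
    return setpoint_c, best


if __name__ == "__main__":
    print(train([900.0, 1000.0, 1100.0]))
```

In this toy setting the search raises the setpoint while doing so saves energy and stops just before the refrigerating capacity would fall below the largest load, which mirrors the trade-off between the energy target and the operating constraint described above.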

It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.

To this end, embodiments of the present application further provide a storage medium storing a plurality of instructions that can be loaded by a processor to perform the steps of any of the energy-saving control methods for a cold source system provided in the embodiments of the present application. For example, the instructions may perform the steps of:

The storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disks, optical disks, and the like.

Since the instructions stored in the storage medium can execute the steps of any of the energy-saving control methods for a cold source system provided in the embodiments of the present application, they can achieve the beneficial effects of any of those methods, which are detailed in the foregoing embodiments and are not repeated here.

The energy-saving control method and device for a cold source system, the electronic device, and the storage medium provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the description of the embodiments is only intended to help in understanding the method and core idea of the application. Meanwhile, those skilled in the art may, according to the idea of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims

1. An energy-saving control method of a cold source system is characterized by comprising the following steps:

2. The method of claim 1, wherein the preset control model comprises a data driving submodel and a mechanism energy consumption submodel, and the predicting the predicted state quantities of the cold source system in multiple dimensions for the target time period according to the current state quantities and the target control strategy comprises:

determining a target control quantity adopted currently according to the current state quantity and a target control strategy;

predicting a first prediction state quantity of the cold source system in a target period by using a data driving submodel, wherein the data driving submodel is a data model obtained based on historical data training of the cold source system, and the historical data comprises the historical state quantity and the historical control quantity of the cold source system;

predicting a second predicted state quantity of the cold source system in a target period by utilizing a mechanism energy consumption submodel, wherein the mechanism energy consumption submodel is a physical model established based on reference energy consumption of internal equipment of the cold source system;

the fusing the predicted state quantities of the multiple dimensions to obtain the fused predicted state quantity comprises the following steps: and fusing the first prediction state quantity and the second prediction state quantity to obtain a fused prediction state quantity.

3. The method of claim 2, wherein the mechanism energy consumption submodel comprises a chiller energy consumption module, a chilled water pump energy consumption module, and a cooling water pump energy consumption module, wherein the second predicted state quantity comprises a second predicted chiller power, a second predicted chilled water pump power, and a second predicted cooling water pump power, and wherein the predicting the second predicted state quantity of the cold source system during the target time period using the mechanism energy consumption submodel comprises:

predicting the second predicted water chilling unit power of the cold source system in the target time period by using a water chilling unit energy consumption module;

predicting a second predicted chilled water pump power of the cold source system in a target time period by using a chilled water pump energy consumption module;

and predicting second predicted cooling water pump power of the cold source system in a target time period by using a cooling water pump energy consumption module.

4. The method of claim 3, wherein the current state quantities comprise a target cooling water outlet temperature and a target chilled water return temperature, the target control quantities comprise a target chilled water outlet temperature and a target chilled water flow, and the predicting, with the chiller energy consumption module, a second predicted chiller power of the cold source system for a target time period comprises:

acquiring cold water model parameters of the energy consumption module of the water chilling unit;

and calculating the second predicted water chilling unit power of the cold source system in a target time period based on the target cooling water outlet water temperature, the target chilled water return water temperature, the target chilled water outlet water temperature, the target chilled water flow and the cold water model parameters.

5. The method of claim 3, wherein the target control amount comprises a target chilled water pump flow rate, and wherein predicting, with the chilled water pump energy consumption module, a second predicted chilled water pump power for the cold source system over a target period of time comprises:

obtaining freezing model parameters of the chilled water pump energy consumption module;

and calculating a second predicted chilled water pump power of the cold source system in a target time period based on the target chilled water pump flow and the freezing model parameters.

6. The method of claim 3, wherein the target control amount comprises a target cooling water pump flow rate, and wherein predicting, using a cooling water pump energy consumption module, a second predicted cooling water pump power of the cold source system over a target time period comprises:

obtaining cooling model parameters of the cooling water pump energy consumption module;

and calculating second predicted cooling water pump power of the cold source system in a target time period based on the target cooling water pump flow and the cooling model parameters.
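A minimal sketch of how the three energy consumption modules of claims 3 to 6 might be evaluated. The claims state only which inputs each module uses; the cubic flow polynomial for the pumps, the load-over-COP form for the chiller, the function names, and all parameter values below are illustrative assumptions, not formulas taken from the application.

```python
WATER_CP = 4.186  # kJ/(kg*K), specific heat capacity of water


def pump_power_kw(flow_kg_s, params):
    """Pump power as a cubic polynomial in flow (assumed form).

    params = (a0, a1, a2, a3) plays the role of the freezing or cooling
    model parameters of the respective pump module.
    """
    a0, a1, a2, a3 = params
    return a0 + a1 * flow_kg_s + a2 * flow_kg_s**2 + a3 * flow_kg_s**3


def chiller_power_kw(t_cw_out_c, t_chw_return_c, t_chw_out_c,
                     chw_flow_kg_s, params):
    """Chiller power as cooling load divided by COP (assumed form).

    params = (cop_base, cop_slope) plays the role of the cold water model
    parameters; COP is assumed to fall linearly with temperature lift.
    """
    cop_base, cop_slope = params
    # Heat removed from the chilled water loop, in kW.
    load_kw = WATER_CP * chw_flow_kg_s * (t_chw_return_c - t_chw_out_c)
    # Lift between cooling water outlet and chilled water outlet.
    lift_c = t_cw_out_c - t_chw_out_c
    cop = max(1.0, cop_base - cop_slope * lift_c)
    return load_kw / cop


if __name__ == "__main__":
    print(chiller_power_kw(32.0, 12.0, 7.0, 50.0, (8.0, 0.1)))
    print(pump_power_kw(10.0, (1.0, 0.0, 0.0, 0.01)))
```

In practice the model parameters would be fitted to the reference energy consumption of the actual equipment rather than chosen by hand as here.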

7. The method according to claim 2, wherein the fusing the first predicted state quantity and the second predicted state quantity to obtain a fused predicted state quantity comprises:

determining a first weight of the first prediction state quantity based on the prediction errors of the data driving submodel and the mechanism energy consumption submodel, and determining a second weight of the second prediction state quantity based on the prediction errors of the data driving submodel and the mechanism energy consumption submodel;

and fusing the first prediction state quantity and the second prediction state quantity based on the first weight and the second weight to obtain a fused prediction state quantity.
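The weighting in claim 7 can be sketched as follows. The claim says only that both weights are determined from the prediction errors of the two submodels; the inverse-error normalization used here, in which the submodel with the smaller error receives the larger weight, and the names `fusion_weights` and `fuse` are assumptions made for illustration.

```python
def fusion_weights(err_data, err_mech, eps=1e-12):
    """Return (first weight, second weight) from the two prediction errors.

    Assumed rule: each weight is the other submodel's share of the total
    error, so a smaller error yields a larger weight. Weights sum to 1.
    """
    total = err_data + err_mech
    if total < eps:  # both submodels are (nearly) exact: weight them equally
        return 0.5, 0.5
    return err_mech / total, err_data / total


def fuse(pred_data, pred_mech, err_data, err_mech):
    """Fuse two predicted state vectors element by element."""
    w1, w2 = fusion_weights(err_data, err_mech)
    return [w1 * a + w2 * b for a, b in zip(pred_data, pred_mech)]


if __name__ == "__main__":
    # Data driving submodel error 1.0, mechanism submodel error 3.0.
    print(fusion_weights(1.0, 3.0))
    print(fuse([100.0], [200.0], 1.0, 3.0))
```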

8. The method according to any one of claims 1 to 7, wherein the performing benefit calculation on the fused predicted state quantities according to a preset reward function and a preset constraint condition to obtain a benefit value of the cold source system in a target time period comprises:

performing reward calculation on the fused predicted state quantity according to a preset reward function to obtain a reward value of the cold source system in a target time period;

performing penalty calculation on the fused predicted state quantity based on a preset constraint condition to obtain a penalty value of the cold source system in a target time period;

and calculating the benefit value of the cold source system in a target time period based on the reward value and the penalty value.

9. The method of claim 8, wherein the fused predicted state quantity comprises fused predicted water chilling unit power, fused predicted chilled water pump power, and fused predicted cooling water pump power, and the performing reward calculation on the fused predicted state quantity according to a preset reward function to obtain a reward value of the cold source system in a target time period comprises:

determining reward weight of the fused predicted state quantity;

and calculating the reward value of the cold source system in the target time period based on the fused predicted water chilling unit power, the fused predicted chilled water pump power, the fused predicted cooling water pump power and the reward weight.

10. The method according to claim 8, wherein the performing penalty calculation on the fused predicted state quantity based on the preset constraint condition to obtain a penalty value of the cold source system in a target time period comprises:

acquiring the equipment load of a data center, the refrigerating capacity coefficient of a cold source system and the refrigerating capacity of the cold source system in a target time period;

and calculating the penalty value of the cold source system in the target time period based on the equipment load of the data center, the refrigerating capacity coefficient, and the refrigerating capacity.
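A minimal sketch of the benefit calculation in claims 8 to 10, under stated assumptions: the reward is taken to be the negative weighted sum of the three fused predicted powers, the penalty is taken to be proportional to any shortfall of the refrigerating capacity below the refrigerating capacity coefficient times the equipment load, and the benefit value is the reward value minus the penalty value. The claims name the inputs but not these formulas; `penalty_weight` and all function names are hypothetical.

```python
def reward_value(chiller_kw, chw_pump_kw, cw_pump_kw, reward_weight=1.0):
    """Reward for one period: lower total power gives a higher reward (assumed)."""
    return -reward_weight * (chiller_kw + chw_pump_kw + cw_pump_kw)


def penalty_value(equipment_load_kw, capacity_coeff, capacity_kw,
                  penalty_weight=100.0):
    """Penalty for one period: charged only when capacity is insufficient (assumed)."""
    required_kw = capacity_coeff * equipment_load_kw
    shortfall_kw = max(0.0, required_kw - capacity_kw)
    return penalty_weight * shortfall_kw


def benefit_value(chiller_kw, chw_pump_kw, cw_pump_kw,
                  equipment_load_kw, capacity_coeff, capacity_kw):
    """Benefit value of one target period: reward value minus penalty value."""
    return (reward_value(chiller_kw, chw_pump_kw, cw_pump_kw)
            - penalty_value(equipment_load_kw, capacity_coeff, capacity_kw))


if __name__ == "__main__":
    # Constraint satisfied: benefit equals the reward.
    print(benefit_value(200.0, 30.0, 20.0, 1000.0, 1.1, 1200.0))
    # Constraint violated: the penalty dominates.
    print(benefit_value(200.0, 30.0, 20.0, 1000.0, 1.5, 1200.0))
```

A large penalty weight relative to the reward weight makes a control strategy that under-cools the machine room unattractive even when it saves energy, which is one common way to encode a hard operating constraint in a reward signal.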

11. The method according to any one of claims 1 to 7, wherein the target time period is the time period next to the current time period, and the determining, by using a preset control model, a total benefit value of the cold source system for a preset time period based on the benefit value comprises:

updating a target control strategy of the preset control model based on the benefit value to obtain an updated control strategy;

taking the target time period as the current time period, taking the updated control strategy as the target control strategy of the preset control model, and continuing to predict the next time period of the cold source system until the benefit value of each time period in the preset time period is obtained;

and determining the total benefit value of the cold source system in the preset time period based on the benefit value of each time period in the preset time period.

12. The method according to any one of claims 1 to 7, wherein after outputting the trained control model when the total benefit value satisfies a preset condition, the method further comprises:

acquiring the current state quantity of a cold source system, wherein the current state quantity comprises the state quantities of the current time period and of a preset time period before the current time period;

determining a system control strategy of the cold source system by utilizing the trained control model based on the current state quantity;

and controlling the cold source system according to the system control strategy.

13. An energy-saving control device of a cold source system is characterized by comprising:

14. A computer readable storage medium storing a plurality of instructions, the instructions being suitable for being loaded by a processor to perform the steps of the energy-saving control method of a cold source system according to any one of claims 1 to 12.

15. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method according to any of claims 1 to 12 are implemented when the program is executed by the processor.