WO2022111232A1

WO2022111232A1 - Method for optimizing control model of water cooling system, electronic device, and storage medium

Info

Publication number: WO2022111232A1
Application number: PCT/CN2021/127980
Authority: WO
Inventors: 弄庆鹏; 周祥生; 屠要峰; 李忠良; 王壮; 高洪
Original assignee: 中兴通讯股份有限公司
Priority date: 2020-11-30
Filing date: 2021-11-01
Publication date: 2022-06-02
Also published as: CN114580688A

Abstract

Embodiments of the present application relate to the technical field of data processing, and provide a method for optimizing a control model of a water cooling system, an electronic device, and a storage medium. The method for optimizing a control model of a water cooling system comprises: obtaining parameters of a data center; creating a state transition model and a control model; optimizing the state transition model according to the parameters of the data center; and optimizing the control model according to the optimized state transition model and the parameters of the data center, and obtaining the optimized control model.

Description

Control model optimization method, electronic device and storage medium for water cooling system

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is based on, and claims priority to, Chinese patent application No. 202011377439.8, filed on November 30, 2020, the entire content of which is hereby incorporated into this application by reference.

Technical field

The embodiments of the present application relate to the technical field of data processing, and in particular, to a method for optimizing a control model of a water cooling system, an electronic device, and a storage medium.

Background art

The power consumption of data centers continues to increase as their scale expands. The water cooling system accounts for about half of the non-information technology (non-IT) energy consumption of a data center, so effectively reducing the energy consumption of the water cooling system is one of the keys to reducing non-IT energy consumption. Usually, a data center optimizes the control strategy model of its water cooling system by interacting either directly with the physical environment of the water cooling system or with a simulated environment, obtaining control strategies and their control effect data, so that control commands that reduce energy consumption can be sent to the water cooling system.

However, on the one hand, when the model interacts directly with the physical environment of the water cooling system, the model has generally not yet been optimized, and a control strategy generated by an unoptimized model is likely to affect the normal operation of the data center. On the other hand, when a simulated environment is used to obtain control strategies and their control effect data for optimizing the control strategy model, the simulated environment usually deviates from the real environment of the data center and of the water cooling system and cannot fit the real environment parameters.

SUMMARY OF THE INVENTION

An embodiment of the present application provides a method for optimizing a control model of a water cooling system, and the method includes the following steps: acquiring parameters of a data center; creating a state transition model and a control model; optimizing the state transition model according to the parameters of the data center; and optimizing the control model according to the optimized state transition model and the parameters of the data center to obtain the optimized control model.

An embodiment of the present application further provides an electronic device, the device includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to execute the above-mentioned control model optimization method for a water cooling system.

An embodiment of the present application further provides a computer-readable storage medium storing a computer program, and when the computer program is executed by a processor, the above-mentioned control model optimization method for a water cooling system is implemented.

Description of drawings

One or more embodiments are exemplified by the pictures in the corresponding drawings, and these exemplified descriptions do not constitute limitations on the embodiments.

FIG. 1 is a flowchart of a control model optimization method for a water cooling system provided by a first embodiment of the present application;

FIG. 2 is a flowchart of step 103 in the control model optimization method of the water cooling system provided by the first embodiment of the present application shown in FIG. 1;

FIG. 3 is a flowchart of a control model optimization method for a water cooling system provided by a second embodiment of the present application;

FIG. 4 is a flowchart of step 305 in the control model optimization method of the water cooling system provided by the second embodiment of the present application shown in FIG. 3;

FIG. 5 is a flowchart of a control model optimization method for a water cooling system provided by a third embodiment of the present application;

FIG. 6 is a schematic structural diagram of an electronic device provided by a fourth embodiment of the present application.

Detailed description

The main purpose of the embodiments of the present application is to propose a control model optimization method for a water cooling system, an electronic device and a storage medium, which provide an optimized control model for the water cooling system of a data center through interaction with a realistic interactive environment, so that the control model fits the real environment and effectively reduces the energy consumption of the water cooling system without affecting the normal operation of the data center.

In order to make the objectives, technical solutions and advantages of the embodiments of the present application more clear, each embodiment of the present application will be described in detail below with reference to the accompanying drawings. However, those of ordinary skill in the art can understand that, in each embodiment of the present application, many technical details are provided for the reader to better understand the present application. However, even without these technical details and various changes and modifications based on the following embodiments, the technical solutions claimed in the present application can be realized. The following divisions of the various embodiments are for the convenience of description, and should not constitute any limitation on the specific implementation of the present application, and the various embodiments may be combined with each other and referred to each other on the premise of not contradicting each other.

The first embodiment of the present application relates to a control model optimization method of a water cooling system, as shown in FIG. 1 , which specifically includes:

In step 101, parameters of the data center are acquired.

Specifically, a data packet sent by the data center is received and parsed to obtain the parameter set of the data center. The data center includes the data center water cooling system, the data center computer room and the data center environment. The water cooling system provides the control parameters and output parameters, the computer room provides the operating parameters, and the data center environment provides the environmental parameters.

More specifically, the control parameters include parameters such as the number of operating cooling towers, the number of operating cooling pumps, the number of operating chilled water pumps, the number of operating plate heat exchangers, the plate heat exchanger water temperature, the cooling tower water temperature, the cooling tower fan operating frequency, the cooling pump operating frequency, the chilled water pump operating frequency, and the differential pressure setting value of the chilled water header; the output parameters include parameters such as the energy consumption of the water cooling system, the outlet air temperature at the terminal of the water cooling system, the return air temperature at the terminal of the water cooling system, and the cooling capacity of the water cooling system; the operating parameters include parameters such as the IT energy consumption and the data center temperature setting; and the environmental parameters include parameters such as the outdoor dry bulb temperature, outdoor wet bulb temperature, and outdoor humidity.

Of course, the above is only a specific example, and the parameters of the data center may also include other parameters in the actual use process, which will not be repeated here.

Step 102, creating a state transition model and a control model.

Specifically, the state transition model describes how the state of the data center changes. Given the environmental parameters, operating parameters and control parameters of the data center as input, it produces the current output parameters of the data center and predicts the environmental parameters and operating parameters of the data center in the next state, that is, after the output parameters have taken effect. The control model is the control model of the water cooling system: given the environmental parameters and operating parameters of the data center as input, it produces the control strategy of the data center, where the control strategy is expressed in the form of control parameters.

Step 103, optimize the state transition model according to the parameters of the data center.

Specifically, the parameters in the data center that can influence the control effect, such as the environmental parameters, operating parameters and control parameters, are used as the input data of the model, and the output parameters that reflect the control effect are used as the output data corresponding to the input data; the model is then trained on these data to optimize it.

More specifically, as shown in Figure 2, step 103 includes:

Step 201: Input the environmental parameters, operating parameters and control parameters into the state transition model to obtain the predicted output parameters.

Step 202: Calculate the state transition loss according to the predicted output parameter and the output parameter.

This embodiment does not limit the loss function used for calculating the state transition loss. In actual use, the loss function may be any loss function that reflects the deviation between the predicted output parameters and the actual output parameters.

Step 203, optimize the state transition model according to the state transition loss.

Specifically, the state transition model is trained using the state transition loss obtained in step 202, so as to optimize the model.
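Steps 201 to 203 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of this application: it assumes a linear state transition model, a mean squared error as the state transition loss (the embodiment leaves the loss function open), and plain gradient descent. The names `predict` and `train_state_transition_model` and the synthetic data are hypothetical.

```python
import numpy as np


def predict(weights, bias, inputs):
    """Step 201: predict output parameters from the model inputs."""
    return inputs @ weights + bias


def train_state_transition_model(inputs, outputs, lr=0.1, epochs=500):
    """Fit an (assumed) linear state transition model by gradient descent."""
    n_samples, n_in = inputs.shape
    weights = np.zeros((n_in, outputs.shape[1]))
    bias = np.zeros(outputs.shape[1])
    losses = []
    for _ in range(epochs):
        # Step 201: predicted output parameters minus actual output parameters.
        error = predict(weights, bias, inputs) - outputs
        # Step 202: state transition loss (assumption: mean squared error).
        losses.append(float(np.mean(np.sum(error ** 2, axis=1))))
        # Step 203: gradient step that reduces the state transition loss.
        weights -= lr * (2.0 / n_samples) * (inputs.T @ error)
        bias -= lr * 2.0 * error.mean(axis=0)
    return weights, bias, losses


# Synthetic stand-in for (environmental, operating, control) -> output data.
rng = np.random.default_rng(0)
inputs = rng.normal(size=(200, 5))
true_weights = rng.normal(size=(5, 2))
outputs = inputs @ true_weights + 0.5

weights, bias, losses = train_state_transition_model(inputs, outputs)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.2e}")
```

In practice the inputs would be the vectorized environmental, operating and control parameters of the data center, and any model and loss that reflect the deviation between predicted and actual output parameters could be substituted.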

Step 104, optimize the control model according to the optimized state transition model and the parameters of the data center, and obtain the optimized control model.

Specifically, in this embodiment, training data are obtained directly or indirectly from the state transition model, the control model and the parameters of the data center, and the control model is then trained on the obtained training data to optimize it.

The control model optimization method for a water cooling system proposed in this embodiment acquires the parameters of the data center to provide training data for the created state transition model and control model, and trains and optimizes the state transition model according to those parameters, so that the control model is given a realistic offline environment for interactive training. The control model is then optimized according to the optimized state transition model and the parameters of the data center, which realizes the optimization of the control model of the water cooling system. First, the present application provides a control model for the water cooling system of the data center that can effectively reduce the energy consumption of the water cooling system. Second, the present application optimizes the control model before use instead of using the initial control model, which is safer and more reliable. Because the optimized control model is obtained through interaction with an environment built from real data, it fits the real environment and effectively reduces the energy consumption of the water cooling system without affecting the normal operation of the data center.

The second embodiment of the present application relates to a control model optimization method for a water cooling system. This embodiment is roughly the same as the first embodiment, except that step 104 uses reinforcement learning to optimize the control model. The specific process is shown in FIG. 3:

In step 301, parameters of the data center are acquired.

Specifically, step 301 in this embodiment is substantially the same as step 101 in the first embodiment, and details are not repeated here.

Step 302, creating a state transition model and a control model.

Specifically, step 302 in this embodiment is substantially the same as step 102 in the first embodiment, and details are not repeated here.

Step 303: Optimize the state transition model according to the parameters of the data center.

Specifically, step 303 in this embodiment is substantially the same as step 103 in the first embodiment, and details are not repeated here.

Step 304, randomly select a set of sample data from the parameters of the data center.

Specifically, the parameters of each type for one state of the data center are selected at random; only data belonging to the same state can constitute a set of sample data.

Step 305, using the control model and the state transition model to process the sample data to obtain training samples.

Specifically, as shown in Figure 4, step 305 includes:

Step 401, generating state representation data according to the environmental parameters and the operating parameters.

Specifically, operating on the data in matrix form effectively improves the running speed and the execution efficiency of the steps. The data can therefore be vectorized to obtain the state representation vector of the data center.

In step 402, the state representation data is input into the control model to obtain predicted control parameters.

Specifically, the state representation vector is input into the control model, and the control model outputs a vector of control parameters predicted by the model.

Step 403: Input the predicted control parameters, the environmental parameters and the operating parameters into the state transition model, obtain the predicted output parameters, and update the environmental parameters and the operating parameters according to the predicted output parameters.

Specifically, after the predicted control parameters, environmental parameters and operating parameters are input into the state transition model, the predicted output parameters output by the model are obtained, together with the environmental parameters and operating parameters of the next state. The environmental parameters and operating parameters are then updated to these next-state values for the next iteration.

Step 404: Update the state representation data according to the updated environment parameters and operating parameters.

Step 405: Evaluate the predicted control parameters according to the predicted control parameters, the predicted output parameters and the control parameters, and obtain a control reward.

Specifically, the control action evaluation value of the control strategy is first obtained from the predicted control parameters and the control parameters; the control effect evaluation value, which includes the energy consumption evaluation value and the cooling capacity evaluation value, is then obtained from the predicted output parameters; finally, the control action evaluation value, energy consumption evaluation value and cooling capacity evaluation value are analyzed together to obtain the control reward.

More specifically, the current predicted control parameters and the control parameters of the data center are substituted into a preset function to obtain the control action evaluation value. The initial energy consumption parameter and the initial cooling capacity parameter are obtained from the first predicted output parameters; the current predicted energy consumption parameter and the current predicted cooling capacity parameter are obtained from the current predicted output parameters; and the historical predicted energy consumption parameter is obtained from the previous predicted output parameters. The energy consumption growth rate is obtained from the current and historical predicted energy consumption parameters, the energy consumption deviation degree is obtained from the current predicted energy consumption parameter and the initial energy consumption parameter, and the growth rate and the deviation degree are averaged to obtain the energy consumption evaluation value. The cooling capacity evaluation value is obtained by analyzing the initial cooling capacity parameter and the current predicted cooling capacity parameter. Finally, a weighted average of the control action evaluation value and the cooling capacity evaluation value gives the constraint evaluation value, and a weighted average of the energy consumption evaluation value and the constraint evaluation value gives the control reward.

It should be noted that the function used in this embodiment is not limited, and may be any function that can obtain an intuitive and accurate evaluation result according to the above data.

Step 406: Generate a training sample according to the state representation data before the update, the predicted control parameters, the control reward and the updated state representation data.

Specifically, a training sample is a quadruple consisting of the state representation data before the update obtained in step 401, the predicted control parameters obtained in step 402, the control reward obtained in step 405, and the updated state representation data obtained in step 404.

Step 407, detecting whether the number of training samples reaches a second threshold.

Specifically, if yes, go to step 408, if not, go to step 402.

Step 408, taking all the training samples as a group of training samples.

Step 306, optimize the control model according to the training samples.

Specifically, the group of training samples obtained in step 408 is used as a training set to train the control model, which completes one round of optimization.

Step 307: Detect whether the optimization times of the control model reach a first threshold.

Specifically, if yes, go to step 308, if not, go to step 304.

It should be noted that the first threshold and the second threshold are only to distinguish between the threshold of the number of optimizations and the threshold of the number of training samples. There is no connection between the first threshold and the second threshold, and they are two values set according to requirements.

Step 308: Obtain and save the optimized control model.
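The control flow of steps 304 to 308, including the inner loop of steps 401 to 408, can be sketched as follows. This is a structural sketch only: the control model, state transition model, reward function and update rule are passed in as callables, and all names (`collect_training_samples`, `optimize_control_model`, the `toy_*` stand-ins and the sample values) are hypothetical and not taken from this application.

```python
import random


def collect_training_samples(sample, control_model, transition_model,
                             reward_fn, second_threshold):
    """Steps 401-408: roll the control model out against the transition model."""
    # Step 401: state representation from environmental and operating parameters.
    state = tuple(sample["environment"]) + tuple(sample["operating"])
    training_samples = []
    while len(training_samples) < second_threshold:            # step 407
        action = control_model(state)                           # step 402
        # Steps 403-404: predicted output parameters and updated state.
        predicted_output, next_state = transition_model(state, action)
        # Step 405: control reward from action, output and recorded control.
        reward = reward_fn(action, predicted_output, sample["control"])
        # Step 406: quadruple (state, action, reward, next state).
        training_samples.append((state, action, reward, next_state))
        state = next_state
    return training_samples                                     # step 408


def optimize_control_model(dataset, control_model, transition_model, reward_fn,
                           update_fn, first_threshold, second_threshold, seed=0):
    """Steps 304-308: repeat sampling, rollout and optimization."""
    rng = random.Random(seed)
    for _ in range(first_threshold):                            # step 307
        sample = rng.choice(dataset)                            # step 304
        batch = collect_training_samples(                       # step 305
            sample, control_model, transition_model, reward_fn, second_threshold)
        control_model = update_fn(control_model, batch)         # step 306
    return control_model                                        # step 308


# Toy stand-ins (assumptions for illustration only).
def toy_control(state):
    return (0.01 * sum(state),)


def toy_transition(state, action):
    return (100.0 + action[0],), state  # output parameters, unchanged state


def toy_reward(action, predicted_output, recorded_control):
    return -abs(action[0] - recorded_control[0])


batches = []


def toy_update(model, batch):
    batches.append(batch)  # a real update would train the model on the batch
    return model


dataset = [{"environment": [30.0, 24.0], "operating": [500.0, 25.0],
            "control": [2.0]}]
optimize_control_model(dataset, toy_control, toy_transition, toy_reward,
                       toy_update, first_threshold=3, second_threshold=4)
print(len(batches), [len(b) for b in batches])
```

The first threshold bounds the number of optimization rounds and the second threshold bounds the number of quadruples collected per round, matching the roles described above.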

Compared with the prior art, this embodiment adds, on the basis of the first embodiment, an evaluation of the cooling capacity, so that the water cooling system meets the temperature control requirements of the data center during normal operation while its energy consumption is reduced.

In order to enable those skilled in the art to understand more clearly the overall flow of the control model optimization method for the water cooling system disclosed in the first and second embodiments of the present application, the third embodiment of the present application applies the method to the water cooling system of the data center described in the table below as an example.

As shown in FIG. 5 , the control model optimization method of the water cooling system provided by the third embodiment of the present application includes:

Step 501: Obtain and parse the offline data packet of historically collected data uploaded by the data center to obtain data samples.

Specifically, a plurality of data samples are obtained through analysis, and each data sample includes the output state parameters of the water cooling system, the environmental state parameters of the data center, the control parameters of the water cooling system, and the operating state parameters of the data center.

More specifically, each data sample includes the following parameters. The output state parameters of the water cooling system include the energy consumption parameter of the water cooling system (CoolingEnergy), the current cooling capacity parameter of the water cooling system (Cooling_cal), the outlet air temperature parameter at the terminal of the water cooling system (AirOutAvgTemp), and the return air temperature parameter at the terminal of the water cooling system (AirInAvgTemp);

Data center environmental status parameters include outdoor dry bulb temperature (OutsideDBTemp1), outdoor relative humidity (OutsideRHumidity1), and outdoor wet bulb temperature (OutsideWetTemp1);

Data center operating status parameters include IT energy consumption (ITEnergy) and data center temperature setting (DCRoomTempSet);

The control parameters of the water cooling system include the number of operating cooling towers (CTNum), the number of operating chilled water pumps (CHWPNum), the number of operating cooling pumps (CWPNum), the number of operating plate heat exchangers (HENum), the plate heat exchanger water temperature setting value (HECHWSTempSet), the cooling tower water supply temperature setting value (CTWSTempSet), the chilled water header 1 differential pressure setting value (CH1_CWSPress), the chilled water header 2 differential pressure setting value (CH2_CWSPress), the cooling water pump 1 operating frequency setting value (CWP1_Frequency), the cooling water pump 2 operating frequency setting value (CWP2_Frequency), the cooling tower 1 fan 1 operating frequency setting value (CT1_FAN1_Frequency), the cooling tower 1 fan 2 operating frequency setting value (CT1_FAN2_Frequency), the cooling tower 2 fan 1 operating frequency setting value (CT2_FAN1_Frequency), the cooling tower 2 fan 2 operating frequency setting value (CT2_FAN2_Frequency), the chilled water pump 1 operating frequency setting value (CHWP1_Frequency), and the chilled water pump 2 operating frequency setting value (CHWP2_Frequency).

It should be noted that, before the next step is executed, a data center water cooling system output state transition model M1 and a water cooling system control parameter exploration optimization model M2 must also be created and initialized. After the models are created, the maximum number of training iterations of the model M2 and the maximum number of explorations of the water cooling system control parameters for each randomly sampled sample are also set. Specifically, the maximum number of training iterations of the model M2 corresponds to the first threshold in the second embodiment, and the maximum number of explorations of the water cooling system control parameters for a sampled sample corresponds to the second threshold in the second embodiment.

In step 502, the output state transition model M1 is trained using the parameters of the data center to obtain the trained and optimized data center water cooling system output state transition model M3.

Specifically, the data center environmental state parameters, data center operating state parameters and water cooling system control parameters are vectorized and fused to generate a data center mixed state representation, which is used as the input feature of the model M1, while the water cooling system output state parameters are vectorized as the output features of the model M1, and the model M1 is trained. Once the optimized model meets the set requirements, training stops and the model M3 is obtained and saved.

Step 503: Randomly sample the parsed data samples to obtain a data sample.

Step 504, obtain the data center state representation vector S and the water cooling system reference control parameter vector A_base according to the sample data.

Specifically, the data center environmental parameters and data center operating state parameters in the data sample are vectorized to obtain the data center state representation vector S, and the water cooling system control parameters in the data sample are vectorized to obtain the water cooling system reference control parameter vector A_base of the data sample.

Step 505: Input the vector S and the vector A_base into the model M3, and obtain the corresponding reference parameter param_base of the output state of the water cooling system.

Specifically, the vector S and the vector A_base are fused and calculated to generate a data center mixed state representation vector, and then the data center mixed state representation vector is input into the model M3 to obtain the parameter param_base output by the model.
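The vectorization and fusion in steps 504 and 505 can be sketched as follows. The field names are the parameter names listed for step 501; everything else is an assumption made for illustration: the fusion operation is taken to be plain concatenation (the embodiment does not specify it), the helper names `vectorize`, `state_vector` and `fuse` are hypothetical, and the sample values are invented.

```python
import numpy as np

ENV_KEYS = ["OutsideDBTemp1", "OutsideRHumidity1", "OutsideWetTemp1"]
OPERATING_KEYS = ["ITEnergy", "DCRoomTempSet"]
CONTROL_KEYS = [
    "CTNum", "CHWPNum", "CWPNum", "HENum", "HECHWSTempSet", "CTWSTempSet",
    "CH1_CWSPress", "CH2_CWSPress", "CWP1_Frequency", "CWP2_Frequency",
    "CT1_FAN1_Frequency", "CT1_FAN2_Frequency", "CT2_FAN1_Frequency",
    "CT2_FAN2_Frequency", "CHWP1_Frequency", "CHWP2_Frequency",
]


def vectorize(sample, keys):
    """Turn the named fields of one data sample into a float vector."""
    return np.array([float(sample[key]) for key in keys])


def state_vector(sample):
    """Data center state representation vector S (environment + operating)."""
    return np.concatenate([vectorize(sample, ENV_KEYS),
                           vectorize(sample, OPERATING_KEYS)])


def fuse(state, control):
    """Mixed state representation; concatenation is an assumed fusion."""
    return np.concatenate([state, control])


# Invented values, for illustration only.
sample = {key: 40.0 for key in CONTROL_KEYS}
sample.update({"CTNum": 2, "CHWPNum": 2, "CWPNum": 2, "HENum": 1,
               "OutsideDBTemp1": 28.5, "OutsideRHumidity1": 60.0,
               "OutsideWetTemp1": 22.4, "ITEnergy": 850.0,
               "DCRoomTempSet": 24.0})

S = state_vector(sample)
A_base = vectorize(sample, CONTROL_KEYS)
S_mix = fuse(S, A_base)
print(S.shape, A_base.shape, S_mix.shape)
```

The resulting mixed vector would then be passed to the trained model M3 to obtain param_base; a learned or normalized fusion could be used instead of concatenation.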

It should be noted that, in addition to serving as a reference value, the parameter param_base is also used as the initial value of the previous water cooling system output state parameter param_hist for each sample. Specifically, the value of the parameter param_base is assigned to the parameter param_hist.

It should be noted that, at this point, the current exploration count of the water cooling system control parameters for the data sample must also be reset, that is, explore_count=0.

Step 506: Input the vector S into the model M2 to obtain the water-cooling system control exploratory parameter vector A_explore.

Specifically, the vector A_explore is the control parameter corresponding to the next state obtained after one exploration, including at least the number of operating cooling towers (CTNum), the number of operating chilled water pumps (CHWPNum), the number of operating cooling pumps (CWPNum), the number of operating plate heat exchangers (HENum), the plate heat exchanger water temperature setting value (HECHWSTempSet), the cooling tower water supply temperature setting value (CTWSTempSet), the chilled water header 1 differential pressure setting value (CH1_CWSPress), the chilled water header 2 differential pressure setting value (CH2_CWSPress), the cooling water pump 1 operating frequency setting value (CWP1_Frequency), the cooling water pump 2 operating frequency setting value (CWP2_Frequency), the cooling tower 1 fan 1 operating frequency setting value (CT1_FAN1_Frequency), the cooling tower 1 fan 2 operating frequency setting value (CT1_FAN2_Frequency), the cooling tower 2 fan 1 operating frequency setting value (CT2_FAN1_Frequency), the cooling tower 2 fan 2 operating frequency setting value (CT2_FAN2_Frequency), the chilled water pump 1 operating frequency setting value (CHWP1_Frequency), and the chilled water pump 2 operating frequency setting value (CHWP2_Frequency).

Step 507: Obtain the current water cooling system output state parameter param_explore according to the vector S and the vector A_explore, further obtain the energy consumption evaluation value and the cooling capacity evaluation value, and obtain the control action evaluation value according to the parameter A_base and the parameter A_explore.

Specifically, the vector S and vector A_explore are input into the water cooling system output state transition model M3 to obtain the current water cooling system output state parameter param_explore; according to the parameter param_base, parameter param_explore and parameter param_hist, respectively obtain the energy consumption evaluation value and cooling capacity evaluation value.

More specifically, the vectors S and A_explore are fused to generate the data center mixed state representation vector S_mix, and the vector S_mix is input into the model M3 to obtain the parameter param_explore output by the model. The energy consumption evaluation value is then obtained from the parameter param_base, the parameter param_explore and the parameter param_hist as follows:

Obtain the water cooling system energy consumption parameter CE_Φ from the water cooling system output state reference parameter param_base, obtain the water cooling system energy consumption parameter CE_t from the current water cooling system output state parameter param_explore, and obtain the water cooling system energy consumption parameter CE_t-1 from the previous water cooling system output state parameter param_hist, and calculate the energy consumption reward value of the current water cooling system control exploratory parameter A_explore according to the following formula:

reward_ce=0.5·(energy consumption deviation degree)+0.5·(energy consumption growth rate)

Among them, reward_ce is the energy consumption evaluation value, the energy consumption deviation degree is obtained from CE_t and CE_Φ, the energy consumption growth rate is obtained from CE_t and CE_t-1, and the two 0.5 are the weights of the energy consumption deviation degree and the energy consumption growth rate.

It should be noted that an arithmetic average, that is, a weighted average with the weights set to (0.5, 0.5), is used for the combined calculation here, but the weights can also be set to other values according to actual needs. In other examples, statistical analysis methods other than averaging may also be used for the calculation, which will not be repeated here.
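The weighted combination of the energy consumption deviation degree and the energy consumption growth rate can be illustrated as follows. The exact expressions for the two terms are not given in the text above, so this sketch assumes relative differences signed so that lower energy consumption yields a higher reward; the function name `energy_reward` and these two expressions are assumptions, not the formula of this application.

```python
def energy_reward(ce_base, ce_prev, ce_now, w_dev=0.5, w_growth=0.5):
    """Energy consumption evaluation value as a weighted average of two terms.

    ce_base: energy consumption from param_base (CE_Φ)
    ce_prev: energy consumption from param_hist (CE_t-1)
    ce_now:  energy consumption from param_explore (CE_t)
    """
    # Assumed deviation degree: relative saving versus the reference.
    deviation = (ce_base - ce_now) / ce_base
    # Assumed growth rate: relative saving versus the previous exploration.
    growth = (ce_prev - ce_now) / ce_prev
    return w_dev * deviation + w_growth * growth


saving = energy_reward(ce_base=100.0, ce_prev=100.0, ce_now=90.0)
regression = energy_reward(ce_base=100.0, ce_prev=90.0, ce_now=99.0)
print(saving, regression)
```

With these assumed terms, an exploration that lowers energy consumption relative to both the reference and the previous exploration is rewarded, while one that raises it relative to the previous exploration is penalized through the growth term.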

Then, according to the parameter param_base and parameter param_explore, the cooling capacity evaluation value is obtained as follows:

Specifically, the cooling capacity parameter Cooling_cal is obtained from the water cooling system output state reference parameter param_base as the reference cooling capacity cool_cal_Φ, and the cooling capacity parameter Cooling_cal is obtained from the current water cooling system output state parameter param_explore as the cooling capacity cool_cal_t corresponding to the current water cooling system control exploratory parameter. The cooling capacity constraint evaluation value g(cool_cal_t, cool_cal_Φ) is then calculated through the water cooling system control parameter cooling capacity constraint evaluation function g(x). It should be noted that the constraint evaluation function g(x) can be designed according to business requirements, and the specific form of the constraint evaluation function g(x) is not limited.

Finally, obtain the evaluation value of the control action according to the parameter A_base and the parameter A_explore.

Specifically, the current water-cooling system control exploratory parameter constraint evaluation value f(A_base, A_explore) is calculated by the water-cooling system control parameter constraint evaluation function f(x). It should be noted that the water cooling system control parameter constraint evaluation function f(x) can be designed according to business requirements, and the specific form of the water cooling system control parameter constraint evaluation function f(x) is not limited.
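Since the forms of f(x) and g(x) are left open, the following shows one purely hypothetical pair of constraint evaluation functions, only to make the roles of the two functions concrete. The specific formulas (a score that decreases with the mean relative deviation from the reference action, and a score that penalizes a cooling capacity shortfall) are assumptions for illustration and are not taken from this application.

```python
import numpy as np


def f(a_base, a_explore):
    """Hypothetical action constraint: 1.0 when the explored control
    parameters equal the reference, decreasing toward 0.0 as the mean
    relative deviation grows."""
    a_base = np.asarray(a_base, dtype=float)
    a_explore = np.asarray(a_explore, dtype=float)
    scale = np.maximum(np.abs(a_base), 1e-9)  # avoid division by zero
    relative_deviation = np.mean(np.abs(a_explore - a_base) / scale)
    return float(max(0.0, 1.0 - relative_deviation))


def g(cool_cal_t, cool_cal_base):
    """Hypothetical cooling capacity constraint: full score when the
    predicted cooling capacity reaches the reference (assumed positive),
    otherwise the achieved fraction of the reference."""
    if cool_cal_t >= cool_cal_base:
        return 1.0
    return max(0.0, cool_cal_t / cool_cal_base)


print(f([2.0, 40.0], [2.0, 44.0]), g(900.0, 1000.0))
```

A function of this kind keeps explored control parameters close to values that were actually observed in the data center, while g(x) discourages actions whose predicted cooling capacity falls below the reference.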

Step 508: Obtain a control reward according to the action evaluation value, the cooling capacity evaluation value and the energy consumption evaluation value.

Specifically, the action constraint evaluation value is first obtained according to the action evaluation value and the cooling capacity evaluation value.

Calculate the action constraint evaluation value of the current water cooling system control exploratory parameter A_explore according to the following formula:

safe_eval_action=0.5·f(A_base,A_explore)+0.5·g(cool_cal_t,cool_cal_Φ)

Here, safe_eval_action is the action constraint evaluation value, f(A_base, A_explore) and g(cool_cal_t, cool_cal_Φ) are the action evaluation value and the cooling capacity evaluation value, respectively, and the two coefficients of 0.5 are their respective weights.

Then, the control reward is calculated through the energy consumption evaluation value and the water cooling system control parameter action constraint evaluation value.

The control reward is calculated using the following formula:

R=0.5·reward_ce+0.5·safe_eval_action

Among them, R is the control reward, reward_ce is the energy consumption evaluation value, safe_eval_action is the action constraint evaluation value, and the two 0.5 are the weights of the energy consumption evaluation value and the action constraint evaluation value respectively.

It should be noted that the above steps first combine the action evaluation value and the cooling capacity evaluation value, and then add the energy consumption evaluation value to continue the analysis. If the three values are analyzed directly, the above process is equivalent to calculating the control reward with the following formula:

R=0.5·reward_ce+0.25·f(A_base,A_explore)+0.25·g(cool_cal_t,cool_cal_Φ)

Among them, R is the control reward, reward_ce is the energy consumption evaluation value, f(A_base, A_explore) is the action evaluation value, g(cool_cal_t, cool_cal_Φ) is the cooling capacity evaluation value, and 0.5, 0.25, and 0.25 are the weights.
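The equivalence of the two-stage calculation and the direct calculation can be checked with a short sketch. The function names are hypothetical, and the three evaluation values are assumed to have been computed already.

```python
def control_reward_two_stage(reward_ce, f_value, g_value):
    """Combine f and g first, then add the energy consumption evaluation value."""
    safe_eval_action = 0.5 * f_value + 0.5 * g_value
    return 0.5 * reward_ce + 0.5 * safe_eval_action


def control_reward_direct(reward_ce, f_value, g_value):
    """Combine the three evaluation values in a single weighted sum."""
    return 0.5 * reward_ce + 0.25 * f_value + 0.25 * g_value


# Illustrative values only; both routes give the same control reward R.
r_two_stage = control_reward_two_stage(0.8, 0.6, 1.0)
r_direct = control_reward_direct(0.8, 0.6, 1.0)
```

Expanding 0.5·(0.5·f + 0.5·g) gives 0.25·f + 0.25·g, which is why the weights in the direct formula are 0.5, 0.25 and 0.25.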

Step 509: Store the tuple (S, A_explore, R, S) in the experience sample pool and update the parameter param_hist to the parameter param_explore.

Specifically, obtaining a tuple (S, A_explore, R, S) completes one exploration: the number of explorations is incremented by 1, and the value of the parameter param_explore is assigned to the parameter param_hist.
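Step 509 can be sketched as follows. The class and function names are hypothetical, the fourth tuple element is assumed to be the state after the exploratory action, and the bounded capacity and uniform random sampling are assumptions that the present application does not specify.

```python
import random
from collections import deque


class ExperienceSamplePool:
    """Hypothetical experience sample pool holding (S, A_explore, R, S_next) tuples."""

    def __init__(self, capacity=10000):
        # Bounded capacity is an assumption; the oldest tuples are dropped first.
        self._samples = deque(maxlen=capacity)

    def store(self, state, a_explore, reward, next_state):
        self._samples.append((state, a_explore, reward, next_state))

    def sample(self, batch_size):
        # Uniform random sampling is an assumption, not stated in the text.
        batch_size = min(batch_size, len(self._samples))
        return random.sample(list(self._samples), batch_size)

    def __len__(self):
        return len(self._samples)


def finish_exploration(pool, state, a_explore, reward, next_state,
                       param_explore, exploration_count):
    """Store the tuple, then return the new param_hist and exploration count."""
    pool.store(state, a_explore, reward, next_state)
    param_hist = param_explore  # param_hist takes the value of param_explore
    return param_hist, exploration_count + 1
```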

Step 510, detecting whether the current number of explorations reaches the maximum number.

Specifically, if so, go to step 511; if not, return to step 506.

Step 511, using the data in the experience sample pool to optimize the model M2.

Specifically, one sampling from the experience sample pool provides the data required for one optimization of the model M2, and the number of optimizations is incremented by 1 after each optimization.

Step 512, detecting whether the current optimization times reaches the maximum value.

Specifically, if so, go to step 513; if not, return to step 505.

In step 513, the model M2 is saved, and an on-line deployment field test is performed on the model M2.
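The overall control flow can be sketched as two nested loops: exploration repeats until the number of explorations reaches its maximum, the model M2 is then optimized once, and the whole cycle repeats until the number of optimizations reaches its maximum, after which the model is saved. All callables below are hypothetical placeholders for the operations described in the steps above, and keeping one experience sample pool across cycles is an assumption.

```python
def optimize_control_model(sample_initial_state, explore_once, optimize_once,
                           save_model, max_explorations, max_optimizations):
    """Sketch of the exploration and optimization loops (hypothetical callables)."""
    pool = []  # experience sample pool, assumed to persist across cycles
    optimization_count = 0
    while optimization_count < max_optimizations:
        state = sample_initial_state()          # sampling step
        exploration_count = 0
        while exploration_count < max_explorations:
            # One exploration yields a tuple (S, A_explore, R, S_next).
            transition = explore_once(state)
            pool.append(transition)
            state = transition[3]               # continue from the updated state
            exploration_count += 1
        optimize_once(pool)                     # one optimization of model M2
        optimization_count += 1
    save_model()                                # save M2 for field testing
    return optimization_count
```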

In addition, it should be understood that the division of the above methods into steps is only for clarity of description. During implementation, several steps can be combined into one step, or one step can be split into multiple steps; as long as the same logical relationship is included, all such variations fall within the protection scope of this patent. Adding insignificant modifications to the algorithm or process, or introducing insignificant designs, without changing the core design of the algorithm and process, also falls within the protection scope of this patent.

The fourth embodiment of the present application relates to an electronic device, as shown in FIG. 6, comprising: at least one processor 601; and a memory 602 communicatively connected to the at least one processor 601; wherein the memory 602 stores instructions executable by the at least one processor 601, and the instructions are executed by the at least one processor 601 so that the at least one processor 601 can execute the control model optimization method for a water cooling system described in any of the above method embodiments.

The memory 602 and the processor 601 are connected by a bus, and the bus may include any number of interconnected buses and bridges, and the bus connects one or more processors 601 and various circuits of the memory 602 together. The bus may also connect together various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore will not be described further herein. The bus interface provides the interface between the bus and the transceiver. A transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing a means for communicating with various other devices over a transmission medium. The data processed by the processor 601 is transmitted on the wireless medium through the antenna, and further, the antenna also receives the data and transmits the data to the processor 601.

Processor 601 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interface, voltage regulation, power management, and other control functions. The memory 602 may be used to store data used by the processor 601 when performing operations.

The fifth embodiment of the present application relates to a computer-readable storage medium storing a computer program. The above method embodiments are implemented when the computer program is executed by the processor.

That is, those skilled in the art can understand that all or part of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

Those of ordinary skill in the art can understand that the above-mentioned embodiments are specific embodiments for realizing the present application, and that in practical applications, various changes in form and details can be made without departing from the spirit and scope of the present application.

Claims

1. A control model optimization method for a water cooling system, comprising:

obtaining parameters of a data center;

creating a state transition model and a control model;

optimizing the state transition model according to the parameters of the data center; and

optimizing the control model according to the optimized state transition model and the parameters of the data center, to obtain an optimized control model.

2. The method according to claim 1, wherein the parameters of the data center include environmental parameters, operating parameters, control parameters and output parameters, and the optimizing the state transition model according to the parameters of the data center comprises:

inputting the environmental parameters, the operating parameters and the control parameters into the state transition model to obtain predicted output parameters;

calculating a state transition loss according to the predicted output parameters and the output parameters; and

optimizing the state transition model according to the state transition loss.

3. The method according to claim 1 or 2, wherein the optimizing the pre-created control model according to the optimized state transition model and the parameters of the data center, to obtain the optimized control model, comprises:

a sampling step of randomly selecting a group of sample data from the parameters of the data center;

processing the sample data by using the control model and the state transition model to obtain training samples;

optimizing the control model according to the training samples;

detecting whether the number of optimizations of the control model reaches a first threshold;

if so, obtaining and saving the optimized control model; and

if not, returning to the sampling step.

4. The method according to claim 3, wherein the sample data includes the environmental parameters, the operating parameters and the control parameters, and the processing the sample data by using the control model and the state transition model to obtain training samples comprises:

generating state characterization data according to the environmental parameters and the operating parameters;

inputting the state characterization data into the control model to obtain predicted control parameters;

inputting the predicted control parameters, the environmental parameters and the operating parameters into the state transition model to obtain the predicted output parameters, and updating the environmental parameters and the operating parameters according to the predicted output parameters;

updating the state characterization data according to the updated environmental parameters and the updated operating parameters;

evaluating the predicted control parameters according to the predicted control parameters, the predicted output parameters and the control parameters to obtain a control reward; and

generating a training sample according to the state characterization data before updating, the predicted control parameters, the control reward and the updated state characterization data.

5. The method according to claim 4, further comprising:

if it is detected that the number of training samples does not reach a second threshold, using the control model and the state transition model to process the updated environmental parameters, the updated operating parameters and the predicted control parameters to obtain training samples, until the number of training samples reaches the second threshold.

6. The method according to claim 4 or 5, wherein the evaluating the predicted control parameters according to the predicted control parameters, the predicted output parameters and the control parameters to obtain a control reward comprises:

obtaining a control action evaluation value for the control strategy according to the predicted control parameters and the control parameters;

obtaining a control effect evaluation value according to the predicted output parameters, wherein the control effect evaluation value includes an energy consumption evaluation value and a cooling capacity evaluation value; and

comprehensively analyzing the control action evaluation value, the energy consumption evaluation value and the cooling capacity evaluation value to obtain the control reward.

7. The method according to claim 6, wherein the obtaining the control effect evaluation value according to the predicted output parameters comprises:

obtaining an initial energy consumption parameter and an initial cooling capacity parameter according to the first predicted output parameters;

obtaining a current predicted energy consumption parameter and a current predicted cooling capacity parameter according to the current predicted output parameters;

obtaining a historical predicted energy consumption parameter from the previous predicted output parameters;

obtaining a growth rate of energy consumption according to the current predicted energy consumption parameter and the historical predicted energy consumption parameter;

obtaining a deviation degree of energy consumption according to the current predicted energy consumption parameter and the initial energy consumption parameter;

averaging the growth rate and the deviation degree to obtain the energy consumption evaluation value; and

analyzing the initial cooling capacity parameter and the current predicted cooling capacity parameter to obtain the cooling capacity evaluation value.

8. The method according to claim 6 or 7, wherein the comprehensively analyzing the control action evaluation value, the energy consumption evaluation value and the cooling capacity evaluation value to obtain the control reward comprises:

performing a weighted average of the control action evaluation value and the cooling capacity evaluation value to obtain a constraint evaluation value; and

performing a weighted average of the energy consumption evaluation value and the constraint evaluation value to obtain the control reward.

9. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the control model optimization method for a water cooling system according to any one of claims 1 to 8.

10. A computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the control model optimization method for a water cooling system according to any one of claims 1 to 8 is implemented.