CN113518962A - Hybrid learning neural network architecture - Google Patents

Hybrid learning neural network architecture

Info

Publication number
CN113518962A
CN113518962A
Authority
CN
China
Prior art keywords
neural network
components
computer room
output
input feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201980093428.6A
Other languages
Chinese (zh)
Inventor
理栈
任志星
张云
王加龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Publication of CN113518962A publication Critical patent/CN113518962A/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 - Computing arrangements using knowledge-based models
    • G06N5/02 - Knowledge representation; Symbolic representation
    • G06N5/022 - Knowledge engineering; Knowledge acquisition
    • G06N5/025 - Extracting rules from data
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/088 - Non-supervised learning, e.g. competitive learning

Abstract

Systems and methods are provided for predicting the energy efficiency of a computer room in a data center, and more particularly, for predicting the Power Usage Efficiency (PUE) of a computer room at optimized parameters using a two-tower deep learning architecture. The two-tower deep learning architecture can automatically learn embeddings from data and ontology structures, and can include training two sub-networks: a first neural network that captures domain knowledge embedded in an ontology and a second neural network that predicts the PUE from inputs. The learning of the first neural network, which may be unsupervised, and the second neural network, which may be supervised, may be simultaneous and is referred to as hybrid learning, and the two-tower deep learning architecture may accordingly be referred to as a Hybrid Learning Neural Network (HLNN) architecture.

Description

Hybrid learning neural network architecture
Background
In a computer room of a data center, an environmental control system, such as a heating, ventilation, and air conditioning (HVAC) system, is provided to maintain an acceptable operating environment for the computing devices in the computer room, including components such as servers, power supplies, displays, routers, and network and communication modules. Based on the total energy consumed by the computer room and the total energy consumed by the computing devices, a Power Usage Efficiency (PUE) may be calculated and used to evaluate the energy efficiency of the computer room. The HVAC system may include many repeated and/or similar components, such as chillers, fans, secondary pumps, air conditioners, refrigeration equipment, and water pumps such as chilled water pumps (CWPs) and secondary chilled water pumps (SCWPs). For example, it is not uncommon for a computer room to be equipped with fifty or more computer room air conditioning (CRAC) units and tens of temperature and humidity sensors.
One method for optimizing PUE is Computational Fluid Dynamics (CFD), which utilizes numerical analysis and data structures to analyze and solve problems involving fluids. CFD solves partial differential equations to predict air flow and heat distribution in computer rooms and is widely used in the design phase. However, CFD-based methods are computationally intensive and less suitable for real-time operation. Furthermore, intensive verification and validation is required to ensure the accuracy of the simulation, especially when variations are involved.
Another method that may be used to predict PUE and/or control HVAC systems is a deep learning neural network. Deep learning neural networks do not depend on any physical model and do not distinguish between the various input features. Classical neural networks derive knowledge and relationships only from historical data and do not possess any domain knowledge. General deep learning models are therefore difficult to adapt to systems with a large number of repeated and similar devices, such as the computer rooms of data centers. Although these HVAC components have complex non-linear correlations, the inputs from the sensors are treated equally from the perspective of the neural network architecture, and the information behind the data may be biased by repeated and/or similar inputs, which may result in overfitting and ultimately inaccuracy and inefficiency.
To avoid repeated and/or similar inputs and improve the PUE or energy consumption of a computer room, a popular solution is to manually aggregate the inputs based on the domain knowledge of human experts and set the aggregated inputs as the inputs to a neural network. However, this solution is computer-room specific and introduces additional manual work. Furthermore, because this solution relies on the experience and analysis of HVAC experts, it is difficult to understand the most reasonable correlations between the various HVAC components sufficiently to achieve energy-efficient computer room conditions under different operating conditions such as external temperature, external humidity, computational load, and the like.
Drawings
The detailed description is set forth with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference numbers in different figures indicates similar or identical items or features.
FIG. 1 illustrates an example block diagram of an environmental control system for use with a Hybrid Learning Neural Network (HLNN) that may be utilized to predict Power Usage Efficiency (PUE) for a computer room.
FIG. 2 illustrates an example detailed block diagram of the environmental control system of FIG. 1 at relevant levels.
FIG. 3 illustrates an example block diagram of an HLNN architecture.
FIG. 4 illustrates an example flowchart describing a process for predicting PUE by the HLNN.
Detailed Description
The systems and methods discussed herein relate to predicting the energy efficiency of a computer room in a data center, and more particularly to using a two-tower deep learning architecture to predict the Power Usage Efficiency (PUE) of a computer room at optimized parameters. The two-tower deep learning architecture can automatically learn embeddings from data and ontology structures, and can include simultaneous training of two sub-networks: an unsupervised auto-encoder network (AE-Net) that captures the domain knowledge embedded in the ontology, and a supervised prediction network (P-Net) that predicts the PUE from inputs. The simultaneous learning of the AE-Net (unsupervised) and the P-Net (supervised) may be referred to as hybrid learning, and the two-tower deep learning architecture may accordingly be referred to as a Hybrid Learning Neural Network (HLNN) architecture.
PUE optimization requires ensuring a reasonable and appropriate operating environment, such as the environment of a computer room, while reducing waste in the settings of the components of an environmental control system, such as an HVAC system. To achieve this, machine learning methods can be used to learn from historical data the complex relationships between the various HVAC components and the energy efficiency of the computer room under different operating conditions.
In the HLNN architecture, a first neural network and a second neural network, such as the AE-Net and the P-Net, respectively, can share a common structure that includes one input layer and two concept layers. Each of the AE-Net and the P-Net may have its own hidden and output layers. The AE-Net, the P-Net, and the shared structure together form the Hybrid Learning Neural Network (HLNN) architecture. The AE-Net may be an unsupervised learning network that is trained so that its output replicates the input with the lowest possible error, whereas the P-Net may be a deep feed-forward neural network for predicting the PUE.
For example, domain knowledge of the components associated with the HVAC system and the computing devices of a computer room may be embedded in the HLNN architecture. By embedding domain knowledge of the components, the number of inputs and the complexity of the search space can be reduced, and the accuracy of the PUE prediction can be improved. The design of the input layer and the concept layer of the two-tower deep learning architecture may be guided by a domain ontology that contains multiple levels of nodes, where the top level may contain the root concept and the bottom level may contain multiple instances. Instances in the bottom level of the ontology may be represented by nodes in the input layer, and concepts in the middle levels may have corresponding nodes in the concept layer of the shared structure. Further, the relationships and/or connections between the levels may be replicated in the input and concept layers.
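As a concrete illustration of replicating ontology links in the shared structure, the connection pattern between the input layer and a concept layer can be derived mechanically from the ontology. The following is a minimal Python/PyTorch sketch; the toy ontology, node names, and sizes are illustrative assumptions rather than values from this disclosure:

    import torch

    # Toy ontology: bottom-level instances grouped under mid-level concepts.
    ontology = {
        "B_1": ["a_1", "a_2", "a_3"],  # e.g., air conditioners
        "B_2": ["a_4", "a_5"],         # e.g., a chiller and a cooling tower
        "B_3": ["a_6"],                # e.g., an external humidity sensor
    }

    instances = sorted({a for group in ontology.values() for a in group})
    concepts = sorted(ontology)

    # mask[i, j] = 1 iff instance j belongs to concept i, replicating the
    # ontology's level-to-level connections in the shared structure.
    mask = torch.zeros(len(concepts), len(instances))
    for i, c in enumerate(concepts):
        for a in ontology[c]:
            mask[i, instances.index(a)] = 1.0

The mask built here is reused by the sketches later in this description.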
FIG. 1 illustrates an example block diagram of an environmental control system 100 for use with a Hybrid Learning Neural Network (HLNN) that may be utilized to predict the Power Usage Efficiency (PUE) of a computer room 102.
The environmental control system 100 may include a number of components, such as an equipment and data collection module 104 communicatively coupled to an HVAC group 106 and an external equipment and data group 108. The equipment and data collection module 104 may be configured to maintain a configuration file of the components managed by the HVAC group 106 and the external equipment and data group 108, receive input data from various sensors associated with those components, send data to those components to partially control the environment of the computer room 102, and calculate a predicted PUE of the computer room 102. Some of the environmental control system components may be located in the computer room 102, while other components may be located outside of the building in which the computer room 102 is located. The environmental control system 100 may monitor the energy consumption of the components associated with the computer room 102, the equipment and data collection module 104, the HVAC group 106, and the external equipment and data group 108. Additionally, the environmental control system 100 may be communicatively coupled to a computer 110. The computer 110 may include one or more processors 112 and a memory 114 communicatively coupled to the one or more processors 112, which may store computer-readable instructions to be executed by the computer 110 to perform the functions of the HLNN described below. The computer 110 may be located within the computer room 102 or may be located remotely from the computer room 102.
The computer room 102 may house computing devices 116 including servers, power supplies, displays, routers, network and communications modules, and the like (not shown). The computing device 116 may be coupled to the environmental control system 100 and may provide information regarding the energy usage of the computing device 116 for the predicted PUE of the computer room 102 based on historical, current, and expected energy usage and computing loads.
FIG. 2 illustrates an example detailed block diagram of the environmental control system 100 of FIG. 1 at relevant levels (levels 1-4 shown).
The HVAC group 106 may include an HVAC control module 202, an air conditioning group 204, and a refrigeration group 206 communicatively coupled to the equipment and data collection module 104. The HVAC control module 202 may be configured to receive operational information from the various sensors and controllers of the air conditioning group 204 and the refrigeration group 206. The HVAC control module 202 may forward the operational information to the equipment and data collection module 104 for computation by the HLNN. The HVAC control module 202 may also be configured to send control information received from the equipment and data collection module 104 to the air conditioning group 204 and the refrigeration group 206 for adjusting various parameters of the air conditioning group 204 and the refrigeration group 206 to optimize the desired parameters for predicting the PUE. The HVAC group 106 may also include a secondary pump group (not shown), which may similarly communicate relevant operational information to and from the HVAC control module 202.
The air conditioning group 204 may include N air conditioners (two, AC-1 208 and AC-N 210, as shown). Although not shown, each of the N air conditioners may include several controllers and sensors, such as a corresponding switch, a corresponding fan speed controller/sensor, a corresponding air conditioner output air temperature sensor, and a corresponding air conditioner return air temperature sensor. Each of the N air conditioners may be configured to receive AC operation information from the corresponding controllers and sensors and forward the AC operation information to the air conditioning group 204, which in turn forwards the AC operation information to the HVAC control module 202. Each of the N air conditioners may also be configured to send AC control information received from the air conditioning group 204 to the corresponding controllers to optimize the desired parameters for predicting the PUE.
The refrigeration group 206 may include multiple refrigeration systems including multiple chillers (chiller-1 212 shown) and multiple cooling towers (tower-1 214 shown). Although not shown, each of the plurality of chillers may include an associated switch, a cooling mode controller, and a leaving chilled water temperature controller/sensor, and each of the plurality of cooling towers may include an associated cooling tower fan speed controller/sensor, a leaving chilled water temperature controller/sensor, and a returning chilled water temperature controller/sensor.
Each of the plurality of refrigeration systems may be configured to receive refrigeration operation information from the corresponding controllers, switches, and sensors (not shown) and forward the refrigeration operation information to the HVAC control module 202 via the refrigeration group 206. Each of the plurality of refrigeration systems may also be configured to send refrigeration control information received from the refrigeration group 206 to the corresponding controllers, switches, and sensors to optimize the desired parameters for predicting the PUE.
The external equipment and data group 108 may include an external equipment monitoring module 216, an external humidity module 218, an external wet bulb temperature module 220, and other modules (not shown) communicatively coupled to the equipment and data collection module 104. The external humidity module 218 may be communicatively coupled to M humidity sensors (two shown, humidity sensor-1 222 and humidity sensor-M 224). The external wet bulb temperature module 220 may be communicatively coupled to M wet bulb temperature sensors (two shown, wet bulb temperature sensor-1 226 and wet bulb temperature sensor-M 228). The external equipment monitoring module 216 may receive humidity and wet bulb temperature information from the corresponding sensors and forward this information to the equipment and data collection module 104 for optimization of the desired parameters for predicting the PUE.
Each block illustrated in FIG. 2 may be associated with one of a plurality of levels of a domain ontology. A domain ontology having four levels is illustrated herein as an example; however, the number of levels of the domain ontology is not limited to four and may be more or fewer than four. Level 1 may include the equipment and data collection module 104, which may be referred to as D_1. Level 2 may include q modules, including the HVAC control module 202 and the external equipment monitoring module 216, which may be referred to as C_1, C_2, ..., C_q. Level 3 may include p modules, including the air conditioning group 204, the refrigeration group 206, the external humidity module 218, and the external wet bulb temperature module 220, which may be referred to as B_1, B_2, ..., B_p. Level 4 may include k modules, including AC-1 208, AC-N 210, chiller-1 212, tower-1 214, humidity sensor-1 222, humidity sensor-M 224, wet bulb temperature sensor-1 226, and wet bulb temperature sensor-M 228, which may be referred to as a_1, a_2, ..., a_k.
FIG. 3 illustrates an example block diagram of an HLNN architecture 300.
The HLNN architecture 300 may include a domain ontology 302, a shared structure 304, a first neural network such as the AE-Net 306, and a second neural network such as the P-Net 308. There may be multiple levels in the ontology, and four levels corresponding to the blocks illustrated in FIG. 2 are shown in the domain ontology 302 as an example. The top level 1 may contain the root concept D_1 310, while the bottom level 4 may contain multiple instances, of which four instances a_1 312, a_2 314, a_n 316, and a_k 318 are shown. These four instances in level 4 of the domain ontology 302 may be represented as nodes a_1 320, a_2 322, a_n 324, and a_k 326, respectively, in the input layer 328 of the shared structure 304.
The second and third levels of the domain ontology 302 may contain multiple concepts, of which two concepts C_1 330 and C_q 332 (in level 2) and three concepts B_1 334, B_2 336, and B_p 338 (in level 3) are shown. These concepts in levels 2 and 3 of the domain ontology 302 may also have corresponding nodes C_1 340, C_q 342, B_1 344, B_2 346, and B_p 348, respectively, in the concept layer 350 of the shared structure 304. Additionally, the relationships/connections between the levels may also be replicated in the input layer 328 and the concept layer 350. For example, in the domain ontology 302, the concept B_1 334 is shown connected to a set of instances a_1 312, a_2 314, and a_n 316, and in the concept layer 350, the corresponding node B_1 344 is likewise connected to the corresponding nodes a_1 320, a_2 322, and a_n 324 in the input layer 328.
The P-Net 308 may be a deep feed-forward neural network and may include a hidden layer 352 and a single-node output layer 354 for outputting a PUE parameter 356, in addition to the input layer 328 and the concept layer 350 of the shared structure 304. An example feed-forward operation of the P-Net 308 is described below; the terms neuron and node are used interchangeably.
Let w_{ij}^l represent the weight between the j-th neuron (node) in the (l-1)-th layer and the i-th neuron in the l-th layer, and let b_i^l be the bias of the i-th neuron in the l-th layer. Using these representations, the feed-forward operation can be described as

    z_i^l = \sum_{j=1}^{R_{l-1}} w_{ij}^l a_j^{l-1} + b_i^l    (1)

where z_i^l is the weighted input to node i, a_j^{l-1} is the output of the j-th node in the (l-1)-th layer, and R_{l-1} is the number of neurons in the (l-1)-th layer.

Given the weight matrix w^l = [w_{ij}^l] and the output vector a^{l-1} = [a_j^{l-1}], equation (1) can be simplified to:

    z^l = w^l a^{l-1} + b^l    (2)

Using the above notation, the activation of node i is

    a_i^l = f_p(z_i^l)    (3)

where f_p is an activation function.
In the shared structure 304, the connections may be guided by domain knowledge, so the nodes in the concept layer 350 may not be fully connected. Let r_{ij}^l represent the conceptual relationship between two concept nodes, node j and node i. The weighted input for node i can then be expressed as:

    z_i^l = \sum_{j=1}^{R_{l-1}} r_{ij}^l w_{ij}^l a_j^{l-1} + b_i^l    (4)

Each of the concept layers 350 may be mapped from a corresponding level of concepts in the domain ontology 302. Let N_i^l represent the set of nodes (instances and concepts) in the (l-1)-th layer that are connected to the i-th node in the l-th layer (i.e., node C_i in the domain ontology 302). The conceptual relationship weight r_{ij}^l can then be expressed as

    r_{ij}^l = 1 if j \in N_i^l, and r_{ij}^l = 0 otherwise    (5)

That is, if the sub-concept/instance node j in level (l-1) is not connected to the concept of node i, then r_{ij}^l is zero; if it is connected, the conceptual relationship weight is 1, which does not affect the learning process. A loss function L_PN(a, d_p) can then be defined, with a representing the input, o_p representing the calculated output of the neural network, and d_p representing the desired output, to measure the error between d_p and o_p.
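To make the role of the conceptual relationship weights concrete, the following is a minimal Python/PyTorch sketch of a concept layer as a fully connected layer whose weights are element-wise masked; the mask plays the role of r_{ij}^l in equations (4) and (5). The class name, initialization, and choice of a sigmoid activation are illustrative assumptions, not details given in this disclosure:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ConceptLayer(nn.Module):
        # A linear layer whose connectivity follows an ontology-derived mask:
        # z_i = sum_j r_ij * w_ij * a_j + b_i, as in equations (4) and (5).
        def __init__(self, mask: torch.Tensor):
            super().__init__()
            out_dim, in_dim = mask.shape
            self.weight = nn.Parameter(0.1 * torch.randn(out_dim, in_dim))
            self.bias = nn.Parameter(torch.zeros(out_dim))
            self.register_buffer("mask", mask)  # r_ij: fixed 0/1, not trained

        def forward(self, a: torch.Tensor) -> torch.Tensor:
            z = F.linear(a, self.weight * self.mask, self.bias)  # masked input
            return torch.sigmoid(z)  # activation f_p; sigmoid assumed here

Because the mask multiplies the weights rather than removing them, the gradients of masked connections are automatically zero, so the ontology constraint survives training.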
The AE-Net 306 may be an unsupervised learning model that includes a hidden layer 358 and an output layer 360, in addition to the input layer 328 and the concept layer 350 of the shared structure 304. The AE-Net 306 can be designed to minimize the difference between the input from the input layer 328 of the shared structure 304 and the output from the output layer 360. Considering an input vector a from the input layer 328, a representation vector c from the top concept layer of the concept layers 350, and an output vector r from the output layer 360 (shown as r_1 362 and r_k 364), the mapping that transforms a to c may be referred to as an encoder, while the mapping that transforms c back to r may be referred to as a decoder. The encoder may consist of the input layer 328 and the concept layer 350, whereas the decoder may consist of the hidden layer 358 and the output layer 360. The training process in the AE-Net 306 can help the encoder preserve the domain knowledge in the domain ontology 302.
Let the input vector a be denoted a = {a_1, a_2, ..., a_k}, the representation vector c = {c_1, c_2, ..., c_q}, and the output vector r = {r_1, r_2, ..., r_k}. The encoder function f_θ and the decoder function g_θ can then be expressed as:

    c = f_\theta(a)    (6)

    r = g_\theta(c)    (7)

In the encoder function f_θ and the decoder function g_θ, let W and W' be the encoder and decoder weight matrices, and b and d the encoder and decoder bias vectors. The encoder function f_θ and the decoder function g_θ can then be expressed as:

    f_\theta(a) = s_f(b + W a)    (8)

    g_\theta(c) = s_g(d + W' c)    (9)

where s_f and s_g are the encoder and decoder activation functions. In probabilistic terms, r is not an exact reconstruction of a, but rather the parameters of a distribution p(A | R = r) that generates a with high probability. The AE-Net 306 can be trained to find the parameter set that minimizes the reconstruction error:

    E_{AE}(\theta) = \sum_{a \in A} L_{AE}(a, r) = \sum_{a \in A} L_{AE}(a, g_\theta(f_\theta(a)))    (10)

where A represents the set of training examples and L_AE is the loss function, or reconstruction error. The input vector may be real-valued, and the loss function L_AE may be the squared error L_AE(a, r) = ||a - r||^2. s_f and s_g may be sigmoid functions.
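Continuing the sketches above, a minimal AE-Net might reuse the masked concept layer as the encoder f_θ and add its own hidden and output layers as the decoder g_θ. The hidden dimension and the sigmoid activations are illustrative assumptions:

    class AENet(nn.Module):
        # Autoencoder tower: the encoder is the shared structure (input and
        # concept layers); the decoder is AE-Net's own hidden/output layers.
        def __init__(self, mask: torch.Tensor, hidden_dim: int = 16):
            super().__init__()
            q, k = mask.shape                      # q concepts, k instances
            self.encoder = ConceptLayer(mask)      # c = f_theta(a), eq. (8)
            self.decoder = nn.Sequential(          # r = g_theta(c), eq. (9)
                nn.Linear(q, hidden_dim), nn.Sigmoid(),
                nn.Linear(hidden_dim, k), nn.Sigmoid(),
            )

        def forward(self, a: torch.Tensor) -> torch.Tensor:
            return self.decoder(self.encoder(a))

    def loss_ae(a: torch.Tensor, r: torch.Tensor) -> torch.Tensor:
        # Squared-error reconstruction L_AE(a, r) = ||a - r||^2, as in eq. (10)
        return ((a - r) ** 2).sum(dim=1).mean()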
The HLNN 300 may be trained in a similar manner as a standard neural network. The only difference may be that the loss function L_Model consists of two components: the loss L_AE of the AE-Net 306 and the prediction loss L_PN of the P-Net 308:

    L_{Model} = L_{PN} + \alpha L_{AE}    (11)

where α is a constant that biases or weights L_AE. Alternatively, L_PN may be biased or weighted by another constant β, and L_Model may be expressed as L_Model = β L_PN + L_AE.
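Expressed in code, continuing the sketches above, the hybrid loss is a weighted sum of the two tower losses; the value of α below is a placeholder, not a value given in this disclosure:

    alpha = 0.5  # illustrative bias constant for L_AE

    def loss_model(pue_pred, pue_true, a, r):
        loss_pn = ((pue_pred - pue_true) ** 2).mean()  # prediction loss L_PN
        return loss_pn + alpha * loss_ae(a, r)         # L_Model, eq. (11)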
In training, the derivative of the loss with respect to a weight can be expressed as

    \partial L_{Model} / \partial w_{ij}^l = \delta_i^l a_j^{l-1}    (12)

and the following substitutions can be made:

    \delta_i^l = \partial L_{Model} / \partial z_i^l    (13)

    \partial L_{Model} / \partial a_i^l = \sum_{k=1}^{R_{l+1}} \delta_k^{l+1} w_{ki}^{l+1}    (14)

where R_{l+1} is the number of nodes in the (l+1)-th layer. Combining equations (12), (13), and (14) yields

    \delta_i^l = f'(z_i^l) \sum_{k=1}^{R_{l+1}} \delta_k^{l+1} w_{ki}^{l+1}    (15)

If layer (l+1) is in the shared structure 304, i.e., the input layer 328 and the concept layer 350, equation (15) may be transformed to:

    \delta_i^l = f'(z_i^l) \sum_{k=1}^{R_{l+1}} r_{ki}^{l+1} \delta_k^{l+1} w_{ki}^{l+1}    (16)

If layer (l+1) is the hidden layer 352 or the output layer 354 of the P-Net 308, equation (15) can be transformed to:

    \delta_i^l = f'(z_i^l) \sum_{k=1}^{R_{l+1}^{P}} \delta_k^{l+1} w_{ki}^{l+1}    (17)

where R_{l+1}^{P} indicates the number of nodes in the (l+1)-th layer used to compute the output of the P-Net 308.

If layer (l+1) is the hidden layer 358 or the output layer 360 of the AE-Net 306, equation (15) can be transformed to:

    \delta_i^l = f'(z_i^l) \sum_{k=1}^{R_{l+1}^{AE}} \delta_k^{l+1} w_{ki}^{l+1}    (18)

where R_{l+1}^{AE} indicates the number of nodes in the (l+1)-th layer used to calculate the output of the AE-Net 306.

Equations (16), (17), and (18) show how the loss function L_Model is back-propagated for learning both the AE-Net 306 and the P-Net 308. The solution for the PUE can be optimized by minimizing the loss calculated by the loss function L_Model as expressed in equation (11), which can be achieved by setting the derivatives of the loss function L_Model, as in equations (16), (17), and (18), to zero and solving for the variables. Because the solution may not always converge to zero, or may take longer than an acceptable time or number of iterations, the values of the derivatives may instead be required to fall below a sufficiently small and acceptable threshold.
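A minimal joint-training sketch, continuing the pieces above, might look as follows. The P-Net head dimensions, learning rate, random stand-in data, and the loss-change stopping test (a practical stand-in for the derivative-below-threshold criterion described above) are all illustrative assumptions:

    # Hypothetical P-Net head sharing the encoder (shared structure) with AE-Net.
    p_head = nn.Sequential(nn.Linear(mask.shape[0], 16), nn.Sigmoid(),
                           nn.Linear(16, 1))           # single-node PUE output
    ae_net = AENet(mask)
    optimizer = torch.optim.SGD(
        list(ae_net.parameters()) + list(p_head.parameters()), lr=0.01)

    threshold, prev_loss = 1e-4, float("inf")
    for step in range(10_000):
        a = torch.rand(32, mask.shape[1])              # stand-in sensor batch
        pue_true = torch.rand(32, 1)                   # stand-in PUE labels
        c = ae_net.encoder(a)                          # shared structure
        loss = loss_model(p_head(c), pue_true, a, ae_net.decoder(c))
        optimizer.zero_grad()
        loss.backward()                                # back-propagates L_Model
        optimizer.step()
        if abs(prev_loss - loss.item()) < threshold:   # small enough: stop
            break
        prev_loss = loss.item()

A single backward pass propagates the combined loss through both towers and the shared structure, which is the hybrid-learning behavior that equations (16) through (18) describe.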
FIG. 4 illustrates an example flowchart 400 describing a process for predicting Power Usage Efficiency (PUE) by the HLNN 300.
At block 402, the HLNN 300 may create an ontology having multiple levels, such as the domain ontology 302, for the components associated with the environmental control system 100, which in turn is associated with the computer room 102, as illustrated in FIGS. 1-3. The HLNN 300 may automatically receive information about the components associated with the environmental control system 100, including corresponding relevant historical data, locations, and physical connections, as well as the hierarchies between the components as illustrated in FIGS. 1-3. The computing devices 116 may include servers, power supplies, displays, routers, network and communication modules (telephone, internet, wireless, etc.), and the like. The relationship between the components of the environmental control system 100 and the computing devices 116 can be based on the load of the computing devices 116, such as the workload or computing load of a server and the electrical load of the server as a function of that workload.
At block 404, the HLNN 300 may receive input feature parameters for the components associated with the environmental control system 100. More specifically, the input layer 328 of the shared structure 304 may receive k instances, a_1 312, a_2 314, a_n 316, and a_k 318, from the domain ontology 302, where k is an integer. Each of the k instances may have a corresponding input feature parameter (a_1 320, a_2 322, a_n 324, and a_k 326 as illustrated in FIG. 3) in the input layer 328, which may belong to one or more corresponding upper concepts of the plurality of upper concepts arranged hierarchically in the concept layer 350.
At block 406, the first neural network, e.g., the AE-Net 306, and the second neural network, e.g., the P-Net 308, may be trained simultaneously. As discussed above, the input vector a, i.e., the input feature parameters, may be expressed as a = {a_1, a_2, ..., a_k}, the representation vector c, i.e., the concepts, may be expressed as c = {c_1, c_2, ..., c_q}, and the output vector r may be expressed as r = {r_1, r_2, ..., r_k}. The mapping that transforms a to c may be referred to as an encoder, and the mapping that transforms c back to r may be referred to as a decoder. The encoder may be comprised of the input layer 328 and the concept layer 350, while the decoder may be comprised of the hidden layer 358 and the output layer 360. The training process in the AE-Net 306 can help the encoder preserve the domain knowledge in the domain ontology 302.
At block 408, the HLNN 300 may minimize the loss based on the loss function L_Model by utilizing the trained AE-Net 306 and the trained P-Net 308, and at block 410 the HLNN 300 may predict the Power Usage Efficiency (PUE) of the computer room 102. The derivatives of the loss function L_Model, such as equations (16), (17), and (18), may be set to zero for solving the variables. Because the solution may not always converge to zero, or may take longer than an acceptable time, the values of the derivatives may instead be required to fall below a sufficiently small and acceptable threshold.
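Continuing the sketches above, prediction at blocks 408 and 410 then reduces to a single forward pass through the shared structure and the P-Net head (again with stand-in inputs):

    with torch.no_grad():
        readings = torch.rand(1, mask.shape[1])   # stand-in sensor readings
        pue = p_head(ae_net.encoder(readings))    # blocks 408-410: predict PUE
    print(f"Predicted PUE: {pue.item():.3f}")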
The trained neural network may be automatically generated, and the training of the trained neural network may be performed by using a gradient descent algorithm to enable learning of the input feature parameters for the corresponding concept. The architecture of the trained neural network can reflect deep learning of multiple components and related concepts based on relationships between the multiple components. The trained neural network may include a hierarchical concept layer, such as concept layer 350, coupled between an input layer, such as input layer 328, and an output layer, such as output layer 354 or 360. A conceptual layer 350 may be added between the input layer 328 and the hidden layers 352 and 358 as illustrated in fig. 3. The concept layer 350 may be embedded with domain knowledge from the domain ontology 302. The concept layer 350 may construct a concept structure based on relationships between multiple components. The concept structure can be created manually or automatically with intelligent components that can communicate with each other. The training portion of the HLNN 300 and the prediction of PUEs using the HLNN 300 may be performed separately and/or by different parties.
A typical deep learning network may not be able to reasonably distinguish between repeated and/or similar input features and may identify the importance of each feature based entirely on historical data. In a structure with a large number of repeated and similar devices, such as the computer room 102, if these repeated and/or similar input feature parameters are not categorized, aggregated, or abstracted, the complexity of the network and the space for learning and searching will increase significantly, requiring data of higher quality and quantity. The model may also overfit easily, reducing prediction accuracy.
Some or all of the operations of the above-described methods may be performed by executing computer readable instructions stored on a computer readable storage medium as defined below. The term "computer readable instructions" as used in the specification and claims includes routines, applications, application modules, program modules, programs, components, data structures, algorithms, and the like. Computer-readable instructions can be implemented on various system configurations, including single-processor or multiprocessor systems, minicomputers, mainframe computers, personal computers, hand-held computing devices, microprocessor-based, programmable consumer electronics, combinations thereof, and the like.
The computer-readable storage medium may include volatile memory (such as Random Access Memory (RAM)) and/or nonvolatile memory (such as Read Only Memory (ROM), flash memory, etc.). Computer-readable storage media may also include additional removable and/or non-removable storage devices, including, but not limited to, flash memory, magnetic storage devices, optical storage devices, and/or tape storage devices, which may provide non-volatile storage of computer-readable instructions, data structures, program modules, and the like.
Non-transitory computer readable storage media are examples of computer readable media. Computer-readable media includes at least two types of computer-readable media, namely computer-readable storage media and communication media. Computer-readable storage media includes volatile and nonvolatile, removable and non-removable media implemented in any process or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer-readable storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information for access by a computing device. In contrast, communication media may embody computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism. As defined herein, computer-readable storage media does not include communication media.
The computer-readable instructions stored on the one or more non-transitory computer-readable storage media, when executed by the one or more processors, may perform the operations described above with reference to fig. 1-4. Generally, computer readable instructions include routines, programs, objects, components, data structures, etc. that perform particular functions or implement particular abstract data types. The order in which the operations are described is not intended to be construed as a limitation, and any number of the described operations can be combined in any order and/or in parallel to implement the processes.
Example clauses
A. A method, the method comprising: receiving input feature parameters of a plurality of components associated with at least one computer room; training a first neural network and a second neural network based on the input feature parameters; and predicting Power Usage Efficiency (PUE) for the at least one computer room based on the output of the first neural network and the output of the second neural network.
B. The method of paragraph A, wherein the first neural network is an unsupervised neural network and the second neural network is a supervised predictive neural network.
C. The method of paragraph A, wherein training the first neural network and the second neural network based on the input feature parameters comprises simultaneously training the first neural network and the second neural network based on the input feature parameters.
D. The method of paragraph A, wherein receiving input feature parameters for the plurality of components associated with the at least one computer room comprises: creating an ontology having a plurality of levels associated with the plurality of components; and receiving information of the plurality of components based on the ontology, including corresponding related concepts, historical data, locations, physical connections, and hierarchies between the plurality of components.
E. The method of paragraph D, wherein the relationship between the plurality of components is based at least in part on a load of computing devices in the computer room.
F. The method of paragraph E, wherein the load of the computing device includes a workload of the computing device and an electrical load used by the computing device.
G. The method of paragraph F, wherein the computing device includes a server and a power source for the server.
H. The method of paragraph D, wherein training the first and second neural networks based on the input feature parameters comprises using a gradient descent algorithm to effect learning of the input feature parameters for corresponding concepts.
I. The method of paragraph A, wherein predicting the PUE of the at least one computer room based on the output of the first neural network and the output of the second neural network comprises biasing a loss associated with the first neural network with a constant value.
J. The method of paragraph I, wherein predicting the PUE of the at least one computer room based on the output of the first neural network and the output of the second neural network comprises minimizing a total loss calculated based on the biased loss associated with the first neural network and an unbiased loss associated with the second neural network.
K. The method of paragraph J, wherein minimizing the loss calculated based on the loss function by utilizing the first neural network and the second neural network comprises solving for a derivative of the loss function to be equal to zero.
L. The method of paragraph J, wherein minimizing the loss calculated based on the loss function by utilizing the trained first neural network and the trained second neural network comprises solving for a derivative of the loss function to be less than or equal to a threshold.
M. A system, the system comprising: one or more processors; and a memory communicatively coupled to the one or more processors, the memory storing computer-readable instructions executable by the one or more processors, the computer-readable instructions, when executed by the one or more processors, causing the one or more processors to perform operations comprising: receiving input feature parameters of a plurality of components associated with at least one computer room; training a first neural network and a second neural network based on the input feature parameters; and predicting Power Usage Efficiency (PUE) for the at least one computer room based on the output of the first neural network and the output of the second neural network.
N. The system of paragraph M, wherein the first neural network is an unsupervised neural network and the second neural network is a supervised predictive neural network.
O. The system of paragraph M, wherein training the first neural network and the second neural network based on the input feature parameters comprises simultaneously training the first neural network and the second neural network based on the input feature parameters.
P. The system of paragraph M, wherein receiving input feature parameters for the plurality of components associated with the at least one computer room comprises: creating an ontology having a plurality of levels associated with the plurality of components; and receiving information of the plurality of components based on the ontology, including corresponding related concepts, historical data, locations, physical connections, and hierarchies between the plurality of components.
Q. The system of paragraph P, wherein the relationship between the plurality of components is based at least in part on a load of the computing device.
R. The system of paragraph Q, wherein the load of the computing device includes a workload of the computing device and an electrical load used by the computing device.
S. The system of paragraph R, wherein the computing device includes a server and a power source for the server.
T. The system of paragraph P, wherein training the first and second neural networks based on the input feature parameters comprises using a gradient descent algorithm to effect learning of the input feature parameters for corresponding concepts.
U. The system of paragraph M, wherein predicting the PUE of the at least one computer room based on the output of the first neural network and the output of the second neural network comprises biasing a loss associated with the first neural network with a constant value.
V. The system of paragraph U, wherein predicting the PUE of the at least one computer room based on the output of the first neural network and the output of the second neural network comprises minimizing a total loss calculated based on the biased loss associated with the first neural network and an unbiased loss associated with the second neural network.
W. The system of paragraph V, wherein minimizing the loss calculated based on the loss function by utilizing the first neural network and the second neural network comprises solving for a derivative of the loss function to be equal to zero.
X. The system of paragraph V, wherein minimizing the loss calculated based on the loss function by utilizing the trained first neural network and the trained second neural network comprises solving for a derivative of the loss function to be less than or equal to a threshold.
Y. A non-transitory computer-readable storage medium storing computer-readable instructions executable by one or more processors, which, when executed by the one or more processors, cause the one or more processors to perform operations comprising: receiving input feature parameters of a plurality of components associated with at least one computer room; training a first neural network and a second neural network based on the input feature parameters; and predicting Power Usage Efficiency (PUE) for the at least one computer room based on the output of the first neural network and the output of the second neural network.
Z. The non-transitory computer-readable storage medium of paragraph Y, wherein the first neural network is an unsupervised neural network and the second neural network is a supervised predictive neural network.
AA. The non-transitory computer-readable storage medium of paragraph Y, wherein training the first neural network and the second neural network based on the input feature parameters comprises simultaneously training the first neural network and the second neural network based on the input feature parameters.
AB. The non-transitory computer-readable storage medium of paragraph Y, wherein receiving input feature parameters for the plurality of components associated with the at least one computer room comprises: creating an ontology having a plurality of levels associated with the plurality of components; and receiving information of the plurality of components based on the ontology, including corresponding related concepts, historical data, locations, physical connections, and hierarchies between the plurality of components.
AC. The non-transitory computer-readable storage medium of paragraph AB, wherein the relationship between the plurality of components is based at least in part on a load of computing devices in the computer room.
AD. The non-transitory computer-readable storage medium of paragraph AC, wherein the load of the computing device includes a workload of the computing device and an electrical load used by the computing device.
AE. The non-transitory computer-readable storage medium of paragraph AD, wherein the computing device includes a server and a power source for the server.
AF. The non-transitory computer-readable storage medium of paragraph AB, wherein training the first and second neural networks based on the input feature parameters comprises using a gradient descent algorithm to enable learning of the input feature parameters for corresponding concepts.
AG. The non-transitory computer-readable storage medium of paragraph Y, wherein predicting the PUE of the at least one computer room based on the output of the first neural network and the output of the second neural network comprises biasing a loss associated with the first neural network with a constant value.
AH. The non-transitory computer-readable storage medium of paragraph AG, wherein predicting the PUE of the at least one computer room based on the output of the first neural network and the output of the second neural network comprises minimizing a total loss calculated based on the biased loss associated with the first neural network and an unbiased loss associated with the second neural network.
AI. The non-transitory computer-readable storage medium of paragraph AH, wherein minimizing the loss calculated based on the loss function by utilizing the first neural network and the second neural network comprises solving for a derivative of the loss function to be equal to zero.
AJ. The non-transitory computer-readable storage medium of paragraph AH, wherein minimizing the loss calculated based on the loss function by utilizing the trained first neural network and the trained second neural network comprises solving for a derivative of the loss function to be less than or equal to a threshold.
Conclusion
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as example forms of implementing the claims.

Claims (36)

1. A method, the method comprising:
receiving input feature parameters of a plurality of components associated with at least one computer room;
training a first neural network and a second neural network based on the input feature parameters; and
predicting Power Usage Efficiency (PUE) for the at least one computer room based on an output of the first neural network and an output of the second neural network.
2. The method of claim 1, wherein the first neural network is an unsupervised neural network and the second neural network is a supervised predictive neural network.
3. The method of claim 1, wherein training the first neural network and the second neural network based on the input feature parameters comprises simultaneously training the first neural network and the second neural network based on the input feature parameters.
4. The method of claim 1, wherein receiving input feature parameters for the plurality of components associated with the at least one computer room comprises:
creating an ontology having a plurality of levels associated with the plurality of components; and
receiving information for the plurality of components based on the ontology, including corresponding related concepts, historical data, locations, physical connections, and hierarchies between the plurality of components.
5. The method of claim 4, wherein the relationship between the plurality of components is based at least in part on a load of computing devices in the computer room.
6. The method of claim 5, wherein the load of the computing device includes a workload of the computing device and an electrical load used by the computing device.
7. The method of claim 6, wherein the computing device comprises a server and a power source for the server.
8. The method of claim 4, wherein training the first and second neural networks based on the input feature parameters comprises using a gradient descent algorithm to enable learning of the input feature parameters for corresponding concepts.
9. The method of claim 1, wherein predicting the PUE of the at least one computer room based on the output of the first neural network and the output of the second neural network comprises biasing a loss associated with the first neural network with a constant value.
10. The method of claim 9, wherein predicting the PUE of the at least one computer room based on the output of the first neural network and the output of the second neural network comprises minimizing a total loss calculated based on biased losses associated with the first neural network and unbiased losses associated with the second neural network.
11. The method of claim 10, wherein minimizing the loss calculated based on the loss function by utilizing the first neural network and the second neural network comprises solving for a derivative of the loss function to be equal to zero.
12. The method of claim 10, wherein minimizing the loss calculated based on the loss function by utilizing the trained first neural network and the trained second neural network comprises solving for a derivative of the loss function to be less than or equal to a threshold.
13. A system, the system comprising:
one or more processors; and
a memory communicatively coupled to the one or more processors, the memory storing computer-readable instructions executable by the one or more processors, the computer-readable instructions, when executed by the one or more processors, causing the one or more processors to perform operations comprising:
receiving input feature parameters of a plurality of components associated with at least one computer room;
training a first neural network and a second neural network based on the input feature parameters; and
predicting Power Usage Efficiency (PUE) for the at least one computer room based on an output of the first neural network and an output of the second neural network.
14. The system of claim 13, wherein the first neural network is an unsupervised neural network and the second neural network is a supervised predictive neural network.
15. The system of claim 13, wherein training the first neural network and the second neural network based on the input feature parameters comprises simultaneously training the first neural network and the second neural network based on the input feature parameters.
16. The system of claim 13, wherein receiving input feature parameters for the plurality of components associated with the at least one computer room comprises:
creating an ontology having a plurality of levels associated with the plurality of components; and
receiving information for the plurality of components based on the ontology, including corresponding related concepts, historical data, locations, physical connections, and hierarchies between the plurality of components.
17. The system of claim 16, wherein the relationship between the plurality of components is based at least in part on a load of the computing device.
18. The system of claim 17, wherein the load of the computing device includes a workload of the computing device and an electrical load used by the computing device.
19. The system of claim 18, wherein the computing device comprises a server and a power source for the server.
20. The system of claim 16, wherein training the first and second neural networks based on the input feature parameters comprises using a gradient descent algorithm to enable learning of the input feature parameters for corresponding concepts.
21. The system of claim 13, wherein predicting the PUE of the at least one computer room based on the output of the first neural network and the output of the second neural network comprises biasing a loss associated with the first neural network with a constant value.
22. The system of claim 21, wherein predicting the PUE of the at least one computer room based on the output of the first neural network and the output of the second neural network comprises minimizing a total loss calculated based on biased losses associated with the first neural network and unbiased losses associated with the second neural network.
23. The system of claim 22, wherein minimizing the loss calculated based on the loss function by utilizing the first neural network and the second neural network comprises solving for a derivative of the loss function to be equal to zero.
24. The system as recited in claim 22, wherein minimizing the loss calculated based on the loss function by utilizing the trained first neural network and the trained second neural network comprises solving for a derivative of the loss function to be less than or equal to a threshold.
25. A non-transitory computer-readable storage medium storing computer-readable instructions executable by one or more processors, the computer-readable instructions, when executed by the one or more processors, causing the one or more processors to perform operations comprising:
receiving input feature parameters of a plurality of components associated with at least one computer room;
training a first neural network and a second neural network based on the input feature parameters; and
predicting Power Usage Efficiency (PUE) for the at least one computer room based on an output of the first neural network and an output of the second neural network.
26. The non-transitory computer-readable storage medium of claim 25, wherein the first neural network is an unsupervised neural network and the second neural network is a supervised predictive neural network.
27. The non-transitory computer-readable storage medium of claim 25, wherein training the first neural network and the second neural network based on the input feature parameters comprises simultaneously training the first neural network and the second neural network based on the input feature parameters.
28. The non-transitory computer-readable storage medium of claim 25, wherein receiving input feature parameters for the plurality of components associated with the at least one computer room comprises:
creating an ontology having a plurality of levels associated with the plurality of components; and
receiving information for the plurality of components based on the ontology, including corresponding related concepts, historical data, locations, physical connections, and hierarchies between the plurality of components.
29. The non-transitory computer-readable storage medium of claim 28, wherein the relationship between the plurality of components is based at least in part on a load of computing devices in the computer room.
30. The non-transitory computer-readable storage medium of claim 29, wherein the load of the computing device comprises a workload of the computing device and an electrical load used by the computing device.
31. The non-transitory computer readable storage medium of claim 30, wherein the computing device comprises a server and a power source for the server.
32. The non-transitory computer-readable storage medium of claim 28, wherein training the first and second neural networks based on the input feature parameters comprises using a gradient descent algorithm to enable learning of the input feature parameters for corresponding concepts.
33. The non-transitory computer-readable storage medium of claim 25, wherein predicting PUEs for the at least one computer room based on outputs of the first and second neural networks comprises biasing losses associated with the first neural network with a constant value.
34. The non-transitory computer-readable storage medium of claim 33, wherein predicting the PUE of the at least one computer room based on the output of the first neural network and the output of the second neural network comprises minimizing a total loss calculated based on biased losses associated with the first neural network and unbiased losses associated with the second neural network.
35. The non-transitory computer readable storage medium of claim 34, wherein minimizing the loss calculated based on the loss function by utilizing the first neural network and the second neural network comprises solving for a derivative of the loss function to be equal to zero.
36. The non-transitory computer readable storage medium of claim 34, wherein minimizing the loss calculated based on the loss function by utilizing the trained first neural network and the trained second neural network comprises solving for a derivative of the loss function to be less than or equal to a threshold.
CN201980093428.6A 2019-05-15 2019-05-15 Hybrid learning neural network architecture Pending CN113518962A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/087083 WO2020227983A1 (en) 2019-05-15 2019-05-15 Hybrid-learning neural network architecture

Publications (1)

Publication Number Publication Date
CN113518962A true CN113518962A (en) 2021-10-19

Family

ID=73290100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980093428.6A Pending CN113518962A (en) 2019-05-15 2019-05-15 Hybrid learning neural network architecture

Country Status (2)

Country Link
CN (1) CN113518962A (en)
WO (1) WO2020227983A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113721151A (en) * 2021-11-03 2021-11-30 杭州宇谷科技有限公司 Battery capacity estimation model and method based on double-tower deep learning network
CN115907202A (en) * 2022-12-13 2023-04-04 中国通信建设集团设计院有限公司 Data center PUE calculation analysis method and system under double-carbon background

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113465139B (en) * 2021-05-28 2022-11-08 山东英信计算机技术有限公司 Refrigeration optimization method, system, storage medium and equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100076607A1 (en) * 2008-08-08 2010-03-25 Osman Ahmed Data center thermal performance optimization using distributed cooling systems
CN109002942A (en) * 2018-09-28 2018-12-14 河南理工大学 A kind of short-term load forecasting method based on stochastic neural net
CN109670623A (en) * 2017-10-16 2019-04-23 优酷网络技术(北京)有限公司 Neural net prediction method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8395621B2 (en) * 2008-02-12 2013-03-12 Accenture Global Services Limited System for providing strategies for increasing efficiency of data centers
CN103645795A (en) * 2013-12-13 2014-03-19 浪潮电子信息产业股份有限公司 Cloud computing data center energy saving method based on ANN (artificial neural network)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100076607A1 (en) * 2008-08-08 2010-03-25 Osman Ahmed Data center thermal performance optimization using distributed cooling systems
CN109670623A (en) * 2017-10-16 2019-04-23 优酷网络技术(北京)有限公司 Neural net prediction method and device
CN109002942A (en) * 2018-09-28 2018-12-14 河南理工大学 A kind of short-term load forecasting method based on stochastic neural net

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Wei et al., "Research on Mutual Learning Neural Network Training Method" (互学习神经网络训练方法研究), Chinese Journal of Computers (计算机学报), vol. 40, no. 6 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113721151A (en) * 2021-11-03 2021-11-30 杭州宇谷科技有限公司 Battery capacity estimation model and method based on double-tower deep learning network
CN115907202A (en) * 2022-12-13 2023-04-04 中国通信建设集团设计院有限公司 Data center PUE calculation analysis method and system under double-carbon background
CN115907202B (en) * 2022-12-13 2023-10-24 中国通信建设集团设计院有限公司 Data center PUE (physical distribution element) calculation analysis method and system under double-carbon background

Also Published As

Publication number Publication date
WO2020227983A1 (en) 2020-11-19

Similar Documents

Publication Publication Date Title
Wei et al. Multi-objective optimization of the HVAC (heating, ventilation, and air conditioning) system performance
Kusiak et al. Multi-objective optimization of HVAC system with an evolutionary computation algorithm
Wei et al. Deep reinforcement learning for joint datacenter and HVAC load control in distributed mixed-use buildings
CN113518962A (en) Hybrid learning neural network architecture
US11835928B2 (en) Adaptive mixed integer nonlinear programming for process management
CN110826784B (en) Method and device for predicting energy use efficiency, storage medium and terminal equipment
Wahid et al. An efficient approach for energy consumption optimization and management in residential building using artificial bee colony and fuzzy logic
Fang et al. A neural-network enhanced modeling method for real-time evaluation of the temperature distribution in a data center
Tian et al. An adaptive ensemble predictive strategy for multiple scale electrical energy usages forecasting
Zhang et al. A novel artificial bee colony algorithm for HVAC optimization problems
Kusiak et al. Reheat optimization of the variable-air-volume box
WO2019227273A1 (en) Hierarchical concept based neural network model for data center power usage effectiveness prediction
Cao et al. PSO-Stacking improved ensemble model for campus building energy consumption forecasting based on priority feature selection
Zhang et al. Residual physics and post-posed shielding for safe deep reinforcement learning method
Cho et al. Rule reduction for control of a building cooling system using explainable AI
Li et al. Data-oriented distributed overall optimization for large-scale HVAC systems with dynamic supply capability and distributed demand response
Zhang et al. Automated machine learning-based building energy load prediction method
Guo et al. Fruit fly optimization algorithm based on single-gene mutation for high-dimensional unconstrained optimization problems
Şencan Modeling of thermodynamic properties of refrigerant/absorbent couples using data mining process
CN116954329A (en) Method, device, equipment, medium and program product for regulating state of refrigeration system
CN112234599B (en) Advanced dynamic self-adaptive partitioning method and system for multi-element complex urban power grid
Wang et al. Decentralized optimization algorithms for variable speed pumps operation based on local interaction game
Bahij et al. A comparison study of machine learning methods for energy consumption forecasting in industry
Yu et al. A Combined Neural and Genetic Algorithm Model for Data Center Temperature Control.
Adejokun et al. Weather Analysis Using Neural Networks for Modular Data Centers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination