WO2023155968A1 - Determining an electrical wiring connection configuration of an electrical power system - Google Patents

Determining an electrical wiring connection configuration of an electrical power system

Info

Publication number
WO2023155968A1
Authority
WO
WIPO (PCT)
Prior art keywords
pdu
electrical equipment
outlets
power
activity
Prior art date
Application number
PCT/EP2022/025170
Other languages
French (fr)
Inventor
Daniel ZUCCHETTO
Maebh LARKIN
Nathan CUNNINGHAM
Niall CAHILL
Fiaz SHAIK
Neil BROCKETT
Original Assignee
Eaton Intelligent Power Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Eaton Intelligent Power Limited
Publication of WO2023155968A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/16 Constructional details or arrangements
    • G06F1/18 Packaging or power distribution
    • G06F1/189 Power distribution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/39 Circuit design at the physical level
    • G06F30/394 Routing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • the invention relates to monitoring an electrical power system comprising a plurality of power distribution units (PDUs) and a plurality of electrical equipment units to be provided with electrical power.
  • the invention relates to determining a configuration of electrical wiring connections between power outlets of the PDUs and the electrical equipment units.
  • Electrical power systems control the delivery of electrical power to individual electrical equipment units or users that require such electrical power. For instance, in a data centre electrical power is delivered to individual units of electrical equipment in a server room, e.g. individual server machines, from power distribution units (PDUs). In particular, electrical power is delivered via electrical wiring connections between outlets of the PDUs and the server machines.
  • Such an electrical wiring configuration can change relatively frequently over time; for instance, when server machines are swapped in and out of service, or when maintenance is to be performed on certain components of the electrical power system.
  • a computer-implemented method for monitoring an electrical power system comprising a plurality of power distribution units (PDUs) and a plurality of electrical equipment units to be provided with electrical power.
  • the method is for determining a configuration of electrical wiring connections between power outlets of the PDUs and the electrical equipment units.
  • the method comprises receiving PDU data indicative of power usage of each of the PDU outlets over time.
  • the method comprises receiving activity data indicative of one or more activity metrics for each of the electrical equipment units over time.
  • the method comprises determining, based on the received PDU data and the received activity data, associations between the PDU outlets and the electrical equipment units to determine the electrical wiring connection configuration between the outlets of the PDUs and the electrical equipment units.
  • Determining the associations between the PDU outlets and the electrical equipment units may comprise, for each of the electrical equipment units: estimating a model that describes the activity of the respective electrical equipment unit as a function of the power usage of each of the PDU outlets; and, selecting, based on the estimated model, which of the PDU outlets are associated with the respective electrical equipment unit.
  • determining the associations between the PDUs and the electrical equipment units may comprise, for each of the PDU outlets, estimating a model that describes the power usage of the respective PDU outlet as a function of the activity of each of the electrical equipment units; and, selecting, based on the estimated model, which of the electrical equipment units are associated with the respective PDU outlet.
  • the function may be a linear function.
  • the model may be estimated using a linear regression approach.
  • the power usage of each of the PDU outlets may have a respective coefficient associated therewith.
  • Each coefficient may be indicative of a proportion of the activity of the electrical equipment unit that relates to the respective PDU outlet.
  • Estimating the model may comprise estimating a value of each of the coefficients.
  • the respective PDU outlet may be determined to be associated with the respective electrical equipment unit.
  • the respective PDU outlet may be determined to not be associated with the respective electrical equipment unit.
  • the selection step may comprise, for one of the PDU outlets: estimating a further model that describes the activity of the respective electrical equipment unit as a function of the power usage of each of the PDU outlets other than said one PDU outlet; and, determining that said one of the PDU outlets is associated with the respective electrical equipment unit if the estimated activity of the further model is within a prescribed tolerance of the estimated activity of the model.
  • said one of the PDU outlets may be discarded from the model.
  • the selection step may comprise repeating the estimation and determination steps for each of the PDU outlets.
  • the further model may describe the activity of the respective electrical equipment unit as a function of the power usage of those PDU outlets not previously discarded from the model.
  • the steps of estimating the model and selecting which of the PDU outlets are associated with the respective electrical equipment unit may be performed simultaneously.
  • the estimating and selecting steps may be performed using an elastic net algorithm.
  • Determining the associations between the PDU outlets and the electrical equipment units may comprise: calculating a distance metric between the power usage of each PDU outlet and the one or more activity metrics of each electrical equipment unit; and, determining, based on the calculated distance metrics, which of the PDU outlets are associated with each of the respective electrical equipment units.
  • Determining which of the PDU outlets are associated with each of the respective electrical equipment units may comprise solving a linear sum assignment problem to minimise a sum of the distance metrics between associated PDU outlets and electrical equipment units.
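The assignment step can be illustrated with a minimal sketch. It assumes a small, hypothetical matrix of distances (rows are equipment units, columns are PDU outlets) and solves the linear sum assignment problem by brute force, which is practical only for tiny inputs. For realistic sizes a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` would be a more plausible choice. The function name `assign_outlets` and the distance values are illustrative assumptions, not taken from the source.

```python
from itertools import permutations


def assign_outlets(distance):
    """Assign each unit (row) a distinct outlet (column) minimising total distance.

    Brute force over permutations, so only suitable for tiny illustrative inputs.
    Assumes there are at least as many outlets as units.
    """
    n_units, n_outlets = len(distance), len(distance[0])
    best_cost, best_choice = float("inf"), None
    for choice in permutations(range(n_outlets), n_units):
        cost = sum(distance[unit][outlet] for unit, outlet in enumerate(choice))
        if cost < best_cost:
            best_cost, best_choice = cost, choice
    return list(best_choice), best_cost


# Hypothetical distances: rows are units, columns are outlets.
distance = [
    [0.1, 0.2, 0.8],
    [0.1, 0.9, 0.7],
]
assignment, total = assign_outlets(distance)
print(assignment, round(total, 3))
```

In this example unit 0 is paired with outlet 1 and unit 1 with outlet 0, for a total distance of 0.3. Choosing each unit's nearest outlet independently would send both units to outlet 0, which is why the problem is solved jointly.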
  • Each electrical equipment unit may be associated with more than one of the PDU outlets.
  • each electrical equipment unit may be associated with two PDU outlets.
  • the method may comprise analysing the determined electrical wiring configuration against one or more defined constraints to be satisfied by the electrical wiring configuration.
  • the method may comprise outputting an action if the determined electrical wiring configuration does not satisfy each of the constraints.
  • the one or more defined constraints may include a redundancy constraint comprising, for each of the plurality of electrical equipment units, that the respective electrical equipment unit must be connected to at least two of the PDU outlets.
  • the one or more defined constraints may include a redundancy constraint comprising, for each of the plurality of electrical equipment units, that at least two of the PDU outlets to which the respective electrical equipment unit is connected must receive power from different power sources of the electrical power system.
  • the method may comprise identifying one or more of the electrical equipment units that provide critical functionality. In some examples, only the identified one or more electrical equipment units may be analysed against the redundancy constraint.
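The two redundancy constraints can be checked against a determined wiring configuration along the following lines. This is a sketch under assumed data structures: `wiring` maps each unit to the set of outlets it was found to be connected to, and `outlet_source` maps each outlet to the power source feeding it. All identifiers and names such as `check_redundancy` are hypothetical.

```python
def check_redundancy(wiring, outlet_source, critical_units=None):
    """Return {unit: [violated constraints]} for the redundancy constraints.

    wiring: unit -> set of outlet ids (the determined configuration).
    outlet_source: outlet id -> power source id.
    critical_units: optional iterable restricting the check to critical units.
    """
    units = wiring.keys() if critical_units is None else critical_units
    violations = {}
    for unit in units:
        outlets = wiring.get(unit, set())
        problems = []
        if len(outlets) < 2:
            problems.append("fewer than two PDU outlets")
        if len({outlet_source[o] for o in outlets}) < 2:
            problems.append("outlets do not span two power sources")
        if problems:
            violations[unit] = problems
    return violations


# Hypothetical configuration for illustration only.
wiring = {
    "server-1": {"pdu-a/1", "pdu-b/1"},
    "server-2": {"pdu-a/2", "pdu-a/3"},
    "server-3": {"pdu-b/2"},
}
outlet_source = {
    "pdu-a/1": "ups-a", "pdu-a/2": "ups-a", "pdu-a/3": "ups-a",
    "pdu-b/1": "ups-b", "pdu-b/2": "ups-b",
}
violations = check_redundancy(wiring, outlet_source)
critical_only = check_redundancy(wiring, outlet_source, critical_units=["server-1"])
```

Passing `critical_units` restricts the analysis to the units identified as providing critical functionality.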
  • the method may comprise selecting a PDU outlet from one or more available PDU outlets of the plurality of PDU outlets. Outputting the action may comprise providing a recommendation to connect the respective electrical equipment unit to the selected PDU outlet. Selection of the PDU outlet from the plurality of PDU outlets may be based on a physical proximity of the respective electrical equipment unit to the one or more available PDU outlets. Optionally, the selected PDU outlet may be the PDU outlet in closest physical proximity to the respective electrical equipment unit.
  • Selection of the PDU outlet from the plurality of PDU outlets may be based on a respective current loading of each of a plurality of power sources providing power to the plurality of PDU outlets.
  • the selected PDU outlet may be a PDU outlet that receives power from the power source having the lowest current loading.
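One way to combine the two selection criteria is sketched below. It prefers an available outlet on the least-loaded power source and, as an assumption made here rather than something stated in the text, breaks ties by physical distance when distances are supplied. The function `recommend_outlet`, the load fractions and the distances are all hypothetical.

```python
def recommend_outlet(available, outlet_source, source_load, distance_m=None):
    """Pick an available outlet on the least-loaded source.

    Ties between outlets on equally loaded sources are broken by proximity
    when distance_m (outlet id -> metres to the unit) is provided.
    """
    def sort_key(outlet):
        load = source_load[outlet_source[outlet]]
        distance = distance_m[outlet] if distance_m else 0.0
        return (load, distance)

    return min(available, key=sort_key)


# Hypothetical inputs for illustration only.
available = ["pdu-a/4", "pdu-b/3", "pdu-b/4"]
outlet_source = {"pdu-a/4": "ups-a", "pdu-b/3": "ups-b", "pdu-b/4": "ups-b"}
source_load = {"ups-a": 0.72, "ups-b": 0.41}   # fraction of rated capacity
distance_m = {"pdu-a/4": 1.0, "pdu-b/3": 3.5, "pdu-b/4": 2.0}

recommended = recommend_outlet(available, outlet_source, source_load, distance_m)
```

Here the nearest outlet overall is on the more heavily loaded source, so the sketch recommends the closer of the two outlets fed by the lightly loaded one.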
  • the one or more defined constraints may include a constraint that each of the PDUs has no more than a defined maximum number of electrical equipment units connected thereto.
  • the one or more defined constraints may include a constraint that a particular electrical equipment unit must be connected to a PDU in a defined subset of the PDUs.
  • Outputting the action may include generating an alarm, optionally an audio and/or visual alarm. Outputting the action may include outputting a notification to one or more personnel, optionally maintenance and/or system administrative personnel.
  • the method may comprise comparing the determined associations at a current time step against previously determined associations from a previous time step to identify changes in the wiring configuration.
  • the method may comprise, for each identified change, determining whether the change is an unexpected change, and outputting an action if the change is an unexpected change.
  • the action may comprise generating an alert for one or more service or maintenance personnel.
  • the action may comprise triggering a secure erase of the electrical equipment unit relevant to the unexpected change.
  • the action may comprise causing physical access to the electrical equipment unit relevant to the unexpected change to be restricted.
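Comparing associations between time steps amounts to a set difference over unit-to-outlet links. The sketch below assumes each configuration is a mapping from unit to a set of outlet ids, and that "expected" changes are known from some list of planned work. That list, like the names `diff_wiring` and `unexpected_changes`, is a hypothetical stand-in for whatever record the operator keeps.

```python
def diff_wiring(previous, current):
    """Return (added, removed) sets of (unit, outlet) links between two time steps."""
    prev_links = {(unit, outlet) for unit, outlets in previous.items() for outlet in outlets}
    curr_links = {(unit, outlet) for unit, outlets in current.items() for outlet in outlets}
    return curr_links - prev_links, prev_links - curr_links


def unexpected_changes(added, removed, planned):
    """Changes not covered by the (hypothetical) set of planned link changes."""
    return sorted((added | removed) - planned)


# Hypothetical configurations at two time steps.
previous = {"server-1": {"pdu-a/1", "pdu-b/1"}, "server-2": {"pdu-a/2", "pdu-b/2"}}
current = {"server-1": {"pdu-a/1", "pdu-b/1"}, "server-2": {"pdu-a/2"}, "server-3": {"pdu-b/3"}}
planned = {("server-3", "pdu-b/3")}   # e.g. an approved installation

added, removed = diff_wiring(previous, current)
alerts = unexpected_changes(added, removed, planned)
```

Each entry left in `alerts` is a candidate for one of the actions listed above, such as notifying maintenance personnel.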
  • the received PDU data may be time series data indicative of power usage of the PDU outlets over a defined historical time period.
  • the received activity data may be time series data indicative of activity metrics of the electrical equipment units over the defined historical time period.
  • the method may comprise, prior to determining the associations between the PDU outlets and the electrical equipment units, realigning the time series data so that samples of the PDU data and activity data relate to the same time steps.
  • the realigning step may comprise up-sampling the PDU data and/or the activity data, and interpolating the up-sampled data to a defined sampling period.
  • the realigning step may comprise down-sampling the interpolated data to a desired sampling period with samples of the PDU data and activity data relating to the same time steps.
  • the electrical power system may be a data centre electrical power system.
  • One or more of the electrical equipment units may be server machines.
  • the one or more activity metrics of each electrical equipment unit may include central processing unit (CPU) utilisation of the electrical equipment unit.
  • the one or more activity metrics of each electrical equipment unit may include memory utilisation of the electrical equipment unit.
  • the one or more activity metrics of each electrical equipment unit may include a number of bytes transferred in input/output operations generated by a process of the electrical equipment unit.
  • the one or more activity metrics of each electrical equipment unit may include disk accesses per second.
  • the one or more activity metrics of each electrical equipment unit may include graphics processing unit (GPU) activity of the electrical equipment unit.
  • a non-transitory, computer-readable storage medium storing instructions thereon that when executed by a processor cause the processor to perform a method as defined above.
  • a controller for monitoring an electrical power system comprising a plurality of power distribution units (PDUs) and a plurality of electrical equipment units to be provided with electrical power.
  • the controller is for determining a configuration of electrical wiring connections between power outlets of the PDUs and the electrical equipment units.
  • the controller comprises one or more processors configured to: receive PDU data indicative of power usage of each of the PDU outlets over time; receive activity data indicative of one or more activity metrics for each of the electrical equipment units over time; and, determine, based on the received PDU data and the received activity data, associations between the PDU outlets and the electrical equipment units to determine the electrical wiring connection configuration between the outlets of the PDUs and the electrical equipment units.
  • Figure 1 schematically illustrates an electrical power system of a data centre in accordance with an example of the invention.
  • Figure 2 shows the steps of a method for monitoring the electrical power system of Figure 1, in accordance with an example of the invention.
  • Figure 1 is a schematic illustration of a data centre 10 that is used to house computer systems and associated components.
  • the data centre 10 may be in the form of a building, or a dedicated space within a building, for instance.
  • FIG. 1 schematically illustrates an electrical power system 12 in which electrical power is supplied to systems and components in the data centre 10.
  • the electrical power system 12 includes a plurality of power distribution units (PDUs) 121 in the form of devices that distribute power from an input to a plurality of outlets of each PDU 121.
  • PDUs are typically used for the distribution of power to equipment such as racks of computers and/or networking equipment in a data centre.
  • the input of each PDU 121 may receive power from any suitable power source 124, e.g. an Uninterruptible Power Supply (UPS), (backup) generator or other utility power source. Different ones of the PDUs 121 may receive power from different power sources 124.
  • a first set 121a of the PDUs 121 may receive power from a first UPS 124a, and a second set 121b of the PDUs 121 may receive power from a second UPS 124b, different from the first UPS 124a.
  • the electrical power system 12 includes a plurality of electrical equipment units or components 122 that need to be provided with electrical power to operate or function.
  • the PDUs 121 may provide power to electrical equipment located in a server room or space 101 of the data centre 10.
  • the electrical equipment units 122 in the server room 101 may primarily include server machines (or, simply, servers) that provide services, e.g. processing or saving/storage services, to various client stations, e.g. computers.
  • the electrical equipment units 122 may also include other server room equipment that requires electrical power, such as peripheral devices or hardware.
  • the PDUs 121 supply electrical power to the servers 122 via physical links 123 therebetween.
  • the links are in the form of electrical wires 123 that each connect an outlet of one of the PDUs 121 to one of the servers 122.
  • each server 122 may be connected to more than one of the PDUs 121. In the context of a data centre, this provides redundancy in the electrical power system as a failure of one PDU does not necessarily mean that operation of an associated server stops, thereby guarding against unplanned downtime of service-critical equipment.
  • the particular wiring configuration of the electrical power system 12 - i.e. which PDU outlets 121 are connected via the wires 123 to which servers 122 - may change relatively frequently over time.
  • servers and associated equipment may be swapped out of commission relatively regularly for maintenance or upgrade, e.g. a particular power line may be shut down for a period.
  • 'moving, adds, changes' (MAC) operations may be performed to install, relocate and/or upgrade various pieces of electrical equipment such as servers.
  • Maintaining mappings of electrical wiring between the outlets of the PDUs 121 and servers 122 manually would be expensive, time consuming, and prone to errors. Furthermore, when performed manually, updates to the mapping may be performed relatively infrequently, meaning that a relatively long time may pass between a change in the wiring configuration occurring, and this change being reflected in the records.
  • Although the electrical power system is described in the context of providing power to equipment in a data centre, it will be appreciated that the described electrical power system may be used in different contexts where PDUs provide electrical power to various electrical equipment units and components, e.g. in a home or office context, at a manufacturing site, etc.
  • Figure 1 also includes a system or controller 14 for monitoring the electrical power system 12.
  • the system 14 is for determining the configuration of the physical wiring or links 123 between the outlets of the PDUs 121 and the servers 122, as will be discussed in greater detail below.
  • the controller 14 includes an input configured to receive data indicative of the operation of the electrical power system 12, for instance data from the PDUs 121, the servers 122, and/or another source, e.g. a storage device, that stores data indicative of the operation of the electrical power system 12.
  • the controller 14 includes an output that may transmit alerts or control signals based on the determined wiring configuration.
  • the controller 14 may be in the form of, or include, any suitable computing device, for instance one or more functional units or modules implemented on one or more computer processors. Such functional units may be provided by suitable software running on any suitable computing substrate using conventional or custom processors and memory. The one or more functional units may use a common computing substrate (for example, they may run on the same server) or separate substrates, or one or both may themselves be distributed between multiple computing devices.
  • a computer memory may store instructions for performing the methods to be performed by the controller 14, and the processor(s) may execute the stored instructions to perform the methods.
  • the system or controller 14 may be regarded as being part of the electrical power system 12 in different examples.
  • the controller may be located in any suitable location.
  • the controller may be in the vicinity of one or more other components of the electrical power system, e.g. in the server room 101 with the server machines of the data centre 10, or in a different location within the data centre 10.
  • the controller may be remote from other components of the electrical power system, and/or remote from the data centre.
  • the controller may be regarded as one of the electrical equipment units that is supplied with power by the PDUs 121, and monitors itself as part of the method described below to automatically determine and monitor a wiring topology between the PDU outlets and electrical equipment units.
  • the present invention is advantageous in that it provides for automatic determination and monitoring of a configuration or topology of the physical links or wiring connections between outlets of PDUs of an electrical power system and electrical equipment units to which the PDUs provide electrical power, e.g. server machines in a data centre.
  • the present invention is advantageous in that the automatic monitoring allows for changes in the wiring configuration - which may occur relatively frequently - to be identified in real time, quasi real time, or at any other desired frequency. This means that action in response to identified changes in the configuration may be performed in a timely manner. For instance, in the case of a security breach where a server is disconnected from the power supply, the breach is detected immediately, meaning that action can be taken quickly to contain the breach. As another example, in a case where one or more of the electrical equipment units have unplanned downtime, then the associated PDUs can be identified immediately, and replaced or repaired if necessary, thereby minimising the unplanned downtime.
  • the automatic monitoring of the present invention also provides an inexpensive and accurate determination of the wiring configuration, and removes the risk of errors that occur, and the expense involved, when such tasks are performed manually.
  • the invention achieves these benefits by determining a mapping between outlets of the PDUs and the electrical equipment units, e.g. servers, representing the physical links between the PDU outlets and the servers.
  • the mapping is determined based on analysing the power usage of each of the PDU outlets in conjunction with the (processing) activity of the servers in order to determine correlations or patterns indicative of physical wiring connections between particular ones of the PDU outlets and particular ones of the servers. This is described in greater detail below.
  • the invention uses data that is readily available in order to automatically map the wiring configuration.
  • Figure 2 shows steps of a method 20 performed by the system or controller 14 to determine and monitor a configuration of electrical wiring connections 123 between power outlets of the PDUs 121 and the electrical equipment units 122, e.g. servers and/or other server room equipment.
  • the method 20 involves receiving PDU data indicative of power usage of each of the PDU outlets over time.
  • the PDU data is received at the input of the controller 14.
  • the PDU data may be in the form of time series data indicative of power usage of the PDU outlets over a defined historical time period, i.e. historical time series data in the form of a power consumption signature indicative of temporal power consumption of each PDU outlet.
  • the time series data may be sampled at regular intervals.
  • the PDU data may be received or obtained directly from each of the PDU outlets (or from each of the PDUs 121). Alternatively, the PDU data may be obtained from a central platform that receives and stores power consumption data for each of the PDU outlets.
  • the PDU data may be received by the controller 14 substantially continuously, meaning power consumption data is received in real time or quasi real time, or the PDU data may be received by the controller 14 at regular intervals with data covering a prescribed period of operation.
  • the method 20 involves receiving activity or performance data indicative of one or more activity or performance metrics for each of the electrical equipment units 122, e.g. servers, over time.
  • the activity data is received at the input of the controller 14.
  • the activity data may be in the form of time series data indicative of one or more measures of server activity or performance over a defined historical time period, i.e. historical time series data in the form of a server activity signature indicative of temporal server activity or performance of each server 122.
  • the time series data may be sampled at regular intervals.
  • the activity data may be received or obtained from each server 122 directly, for instance via standard monitoring interfaces commonly available on servers, e.g. VMware, vCenter, Windows Sysinternals, SolarWinds IT monitoring software, HPE OneView, etc. That is, the activity data may be retrieved from each server 122 by connecting to a management server or special API (application programming interface) on each server.
  • the activity data may be received from a platform management system (PMS) for the servers.
  • the activity data may include any suitable data indicative of activity or performance of each server 122 over time.
  • the activity data may include processor usage (central processing unit (CPU) percentage), memory usage, bytes read/written on the disk (e.g. disk access per second), bytes sent/received on a network interface, graphics processing unit (GPU) activity of the server, etc.
  • the method 20 may optionally involve realigning the received PDU and activity time series data so that samples of the PDU data and activity data relate to the same time steps.
  • a sampling period of received data may not be constant over time. For instance, even if a sampling period should be five seconds, then in practice it may actually be between four and six seconds.
  • Each server may also have a different sampling rate and/or be sampled at different instants, e.g. a first server is sampled at 0, 5, 10, ... seconds, whereas a second server is sampled at 2, 12, 22,... seconds.
  • the received data may therefore be manipulated to have the same sampling period and same sampling instances.
  • This may be performed by interpolation, e.g. linear interpolation, of the received data.
  • the data realignment may involve up-sampling the received PDU data and/or the activity data, and interpolating the up-sampled data to a defined sampling period.
  • the up-sampled, interpolated data may then be down-sampled to a desired sampling period (typically the same sampling period as the original data), e.g. one second, with samples of the PDU data and activity data relating to the same time steps, i.e. the resulting data has sampling instants that are common across the PDU outlets and servers.
  • the down-sampling means that desired data is retained with the remaining data discarded.
  • This data realignment may beneficially allow for more accurate analysis and comparison of PDU data and server data to identify patterns and associations in the following steps.
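The realignment described above (up-sample, interpolate, then down-sample onto common instants) can be sketched with plain linear interpolation. The sketch assumes each series is a pair of sorted timestamp and value lists, restricts the common grid to the time span covered by every series, and uses a one-second fine grid. The names `interpolate` and `realign` and the sample values are illustrative assumptions.

```python
from bisect import bisect_right


def interpolate(times, values, t):
    """Linearly interpolate the series (times, values) at instant t, clamped at the ends."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def realign(series, period, fine_period=1.0):
    """Put every series on one shared grid.

    series: name -> (times, values). Each series is up-sampled onto a fine grid
    by interpolation, then every (period / fine_period)-th sample is kept.
    """
    start = max(times[0] for times, _ in series.values())
    end = min(times[-1] for times, _ in series.values())
    n_fine = int((end - start) / fine_period) + 1
    fine_grid = [start + k * fine_period for k in range(n_fine)]
    step = max(1, round(period / fine_period))
    aligned = {}
    for name, (times, values) in series.items():
        fine = [interpolate(times, values, t) for t in fine_grid]  # up-sample
        aligned[name] = fine[::step]                               # down-sample
    return fine_grid[::step], aligned


# Hypothetical series sampled at different instants (seconds).
series = {
    "outlet-1": ([0, 5, 10, 15, 20], [100, 110, 120, 130, 140]),
    "server-1": ([2, 12, 22], [20, 40, 60]),
}
grid, aligned = realign(series, period=5)
```

After the call, both series share the sampling instants 2, 7, 12 and 17 seconds, so their samples can be compared time step by time step.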
  • the method 20 involves determining, based on the received PDU data and the received activity data, associations between the PDU outlets and the electrical equipment units 122 to determine the electrical wiring connection configuration between the outlets of the PDUs 121 and the electrical equipment units 122.
  • the associations may be determined or inferred for each server (or other electrical equipment unit) 122 by estimating a model that describes the activity of the respective server 122 as a function of the power usage of each of the PDU outlets. The estimated model may then be used to determine which of the PDU outlets are associated with the respective server 122. In different examples, a model that describes the power usage of a respective PDU outlet as a function of the activity of the servers 122 may be estimated, and the estimated model may then be used to determine which of the servers 122 are associated with the respective PDU outlet.
  • a model is then fitted to predict or estimate server activity using the power signature time series of each of the PDUs 121.
  • the coefficient a_0 may be regarded as an intercept term which represents a baseline level of activity of the server 122 under consideration that is not explained by power consumption of any of the PDU outlets.
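The linear model that this passage appears to describe can be written out for a single server as follows. Only the intercept a_0 is named in the text. The remaining symbols are notation assumed here for illustration: y(t) is the server activity at time step t, x_j(t) is the power usage of PDU outlet j, N is the number of outlets, a_j are the per-outlet coefficients, and ε(t) is the residual.

```latex
y(t) = a_0 + \sum_{j=1}^{N} a_j \, x_j(t) + \varepsilon(t)
```

Under this reading, a coefficient a_j that is close to zero suggests that outlet j contributes nothing to explaining the server's activity.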
  • a step is performed to discard those PDU outlets that are unrelated to the respective server 122 from the model. This may be referred to as a feature selection step.
  • the feature selection step examines or analyses the inferred coefficients in the estimated model and, specifically, the strength of the relationship between the time series for each of the PDU outlets and the server 122. PDU outlets whose inferred coefficients are deemed not to differ significantly from zero are discarded, and the remaining PDU outlets are deemed to be connected to the server 122 under consideration.
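The estimate-then-select approach can be sketched as an ordinary least-squares fit followed by a check on the coefficients. The sketch uses synthetic data in which the activity depends on one outlet only. As a simplifying assumption it replaces the statistical significance test with a plain magnitude threshold, which a real implementation would likely swap for a proper test. All function names, data and the threshold are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_linear(outlet_power, activity):
    """Least-squares coefficients [a_0, a_1, ...] for activity ~ a_0 + sum_j a_j * power_j."""
    rows = [[1.0] + [series[t] for series in outlet_power] for t in range(len(activity))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, activity)) for i in range(k)]
    return solve(xtx, xty)


def select_outlets(coefficients, threshold):
    """Keep outlets whose coefficient magnitude exceeds the (assumed) threshold."""
    return [j for j, a in enumerate(coefficients[1:]) if abs(a) > threshold]


# Synthetic power series for three outlets; the server is fed by outlet 0 only.
outlet_power = [
    [100, 120, 90, 150, 130, 110],
    [80, 80, 95, 70, 85, 90],
    [60, 75, 55, 65, 70, 58],
]
activity = [5.0 + 0.4 * p for p in outlet_power[0]]

coefficients = fit_linear(outlet_power, activity)
selected = select_outlets(coefficients, threshold=0.05)
```

On this noise-free data the fit recovers a coefficient of about 0.4 for outlet 0 and coefficients near zero for the other two, so only outlet 0 is retained.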
  • the feature selection step may be approached in a stepwise manner. For instance, one of the PDU outlets may be considered for removal from the estimated model. A comparison of model metrics with and without said one of the PDU outlets may be performed. For instance, this may involve estimating a further model in the absence of the data associated with the PDU outlet being considered for removal, and comparing the model and further model. If there is no statistically significant degradation of performance of the server 122 under consideration, then it may be assumed that said one PDU outlet is not connected to the server 122, and said one PDU outlet is removed from the model. Otherwise, said one PDU outlet is retained in the model. This process may be repeated for each of the PDU outlets. A linear regression approach may be utilised to perform the feature selection step.
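The stepwise variant can be sketched as backward elimination: each outlet is removed in turn, the model is refitted, and the outlet is discarded if the fit does not get worse by more than a tolerance. The sketch compares mean squared errors against a fixed tolerance, which is an assumed stand-in for the statistical comparison mentioned in the text. The data are synthetic and the function names are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def mean_squared_error(columns, target):
    """Fit target ~ intercept + columns by least squares and return the fit's MSE."""
    rows = [[1.0] + [c[t] for c in columns] for t in range(len(target))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    coeffs = solve(xtx, xty)
    fitted = [sum(a * v for a, v in zip(coeffs, row)) for row in rows]
    return sum((y - f) ** 2 for y, f in zip(target, fitted)) / len(target)


def backward_eliminate(outlet_power, activity, tolerance):
    """Discard each outlet whose removal degrades the fit by no more than tolerance."""
    kept = list(range(len(outlet_power)))
    for candidate in list(kept):
        reduced = [j for j in kept if j != candidate]
        full_error = mean_squared_error([outlet_power[j] for j in kept], activity)
        reduced_error = mean_squared_error([outlet_power[j] for j in reduced], activity)
        if reduced_error - full_error <= tolerance:
            kept = reduced  # removing the outlet costs nothing: treat as unconnected
    return kept


# Synthetic data: the server draws power through outlets 0 and 2, not outlet 1.
outlet_power = [
    [100, 120, 90, 150, 130, 110],
    [80, 80, 95, 70, 85, 90],
    [60, 75, 55, 65, 70, 58],
]
activity = [5.0 + 0.3 * p0 + 0.2 * p2 for p0, p2 in zip(outlet_power[0], outlet_power[2])]

connected = backward_eliminate(outlet_power, activity, tolerance=0.01)
```

Outlet 1 is dropped because the model without it fits equally well, while removing either of the other two outlets makes the fit noticeably worse.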
  • time series data may be broken into smaller subsections of data, and then joined together in order to minimise the effect of unusual instances in the data when estimating the model.
  • Although the steps of estimating a (linear) model and performing feature selection are described as separate steps with feature selection following model estimation, these steps may alternatively be performed simultaneously. In particular, this may be performed using an elastic net regularisation algorithm.
  • the elastic net is a regularised regression method that linearly combines penalties of lasso and ridge methods, also known to the skilled person.
  • the elastic net algorithm is described, for instance, in ‘Regularization and Variable Selection via the Elastic Net’, Zou et al., J. R. Statist. Soc.
  • the lasso (least absolute shrinkage and selection operator) method or algorithm is described, for instance, in ‘Regression shrinkage and selection via the lasso’, Tibshirani, J. R. Statist. Soc. B (1996), 58, No. 1, pp. 267-288.
  • the ridge regression algorithm is described, for instance, in ‘Ridge Regression: Biased Estimation for Nonorthogonal Problems’, Hoerl et al., Technometrics (1970), Vol. 12, No. 1, pp. 55-67.
  • the elastic net algorithm is a combination of the lasso model and the ridge regression model.
  • these models aim to fit a linear model between an outcome - in this case, the server time series - and predictors - in this case, the PDU time series - while aiming to minimise the complexity of the resulting model.
  • ‘complexity’ refers to a number of variables used in the model.
  • the lasso model achieves this by discarding predictors by setting the value of their coefficient to zero, while ridge regression achieves this by shrinking the coefficients towards zero.
  • the coefficients can be estimated using coordinate descent, which aims to minimise a loss function that penalises for the complexity of the model.
  • the lasso model could potentially discard a PDU time series that is highly correlated with another of the PDU time series, e.g. where power use is balanced across two PDU outlets. Also, the use of ridge regression in isolation would fail to discard any of the PDU outlets.
  • the elastic net algorithm allows for the combination of these approaches, in particular allowing for irrelevant PDU outlets to be discarded as such, while relevant, but highly correlated, PDU outlets are retained.
  • the model may be a nonlinear model rather than a linear model.
  • a random forest may be used, which can also simultaneously infer or estimate a relationship between a server and the PDU outlets, while discarding extraneous PDU outlets.
  • the method steps to determine the server-PDU associations may for instance be scheduled at regular intervals using the data generated in the intervening timestamps. Such scheduling allows for the repository of server-PDU associations to be maintained and kept up-to-date with no intervention needed from an operator.
  • the step of determining the associations between the PDU outlets and the servers 122 is performed based on calculated distance metrics.
  • the distances between the power usage or consumption time series of each PDU outlet and the activity or performance metric time series of each server 122 are calculated.
  • the calculated distances are measures of similarity, i.e. a correlation between two time series. The greater the distance, the less similar two time series are. On the other hand, lesser distances indicate greater similarity between the time series signals.
  • the distance metrics may be calculated using any suitable method, for instance a mean square error, with a correlation coefficient (e.g. Pearson correlation coefficient, Kendall coefficient, Spearman coefficient, etc.) as a measure of a linear correlation between two sets of data, i.e. two time series.
  • the distances between two time series may be calculated in different ways, such as: the multiplicative inverse of the correlation; using a Matrix Profile algorithm, which is known to the skilled person, and is described for instance in ‘Matrix Profile XII: MPdist: A Novel Time Series Distance Measure to Allow Data Mining in More Challenging Scenarios’, Gharghabi et al., 2018 IEEE International Conference on Data Mining, pp.
  • structural similarity index measure (SSIM)
  • PDU outlets are assigned to servers 122 so that the sum of distances between the two assigned time series, across all of the assignments, is minimised. That is, the sum of all distances between chosen couples/pairs of time series (or other form of the received data) is minimised.
  • This is referred to as the linear sum assignment problem, as is known to the skilled person, and it can be solved, for instance, as described in ‘On Implementing 2D Rectangular Assignment Algorithms’ Crouse, IEEE Transactions on Aerospace and Electronic Systems (2016), Vol. 52, No. 4, pp. 1679-1696.
  • each server machine 122 may be required to have redundant power supplies. As such, multiple PDU outlets may be associated to each server.
  • One hypothesis for the linear sum assignment problem may therefore be that there are two power supplies (PDU outlets) per server, and the problem is solved to minimise the sum of distances based on this constraint or assumption.
  • the resulting/determined assignments or associations are stored in a repository or data store (part of, or separate from, the controller 14).
  • the process of determining assignments or associations may be scheduled to be repeated at regular intervals using the data generated in the intervening timestamps.
  • the set of determined associations together constitute the determined configuration of the wiring connections between the PDU outlets and servers 122.
  • the method 20 may optionally involve analysing the server-PDU associations determined in the previous step. In one example, this may involve analysing the determined electrical wiring configuration against one or more defined constraints to be satisfied by the electrical wiring configuration. These constraints may for instance include that a particular server 122 needs to be located in a specific rack of PDUs 121, and/or that each server 122 (or a certain subset of servers 122) needs to be connected to at least two different PDUs 121 (for redundancy capability). The constraints may additionally or alternatively include that the PDUs 121 must have at most a predefined number of servers 122 connected thereto, for instance because of power limits of the power source 124 that the PDUs 121 are connected to. Such analysis against constraints may be performed irrespective of how the server-PDU associations are determined in step 203, but may particularly be used in the example in which a model is estimated to determine the associations.
  • analysing the server-PDU associations may involve comparing the set or list of associations determined in step 203 (a current time step) with a list of server-PDU associations determined or generated at a previous iteration or run of the process (at a previous time step).
  • the previous associations may be stored in the data store, and retrieved to perform the comparison.
  • the comparison step aims to identify differences between the current and previous sets of associations to identify changes that have occurred to the wiring configuration.
  • a first element (association) may be picked from the current list of associations. If this element is in the previous list then the next element of the current list is considered. If the first element is not in the previous list, then it may be determined whether such a change is expected. For instance, a particular server may be tagged prior to the determination of the current list of associations to indicate that it is about to be moved, added, etc. In this case, the change may be regarded as being expected. On the other hand, if no such tag or other information is available, then the change may be regarded as unexpected, and the particular element may be marked as such. This is repeated for each element (i.e. each entry or row of the current list).
  • consideration of each element of the new list may be followed by considering each element of the previous list to identify associations that were present previously, but have now disappeared, i.e. they do not appear in the current list. Again, where a change is identified from the previous list to the new list, a check may be performed to determine whether the change is expected, e.g. whether information is available to indicate that a particular server was about to be removed from the system prior to determining the current list of associations.
  • Such analysis of comparing current and previous association lists may be performed irrespective of how the server-PDU associations are determined in step 203, but may particularly be used in the example in which the associations are determined based on minimising distance metrics between the time series.
  • a timestamped log of previous association lists may be stored for further analysis, e.g. to track how changes in the topology may impact the overall efficiency of the system.
  • the method may optionally involve outputting, via the controller 14, one or more remedial actions based on the analysis of the determined associations performed in step 204.
  • an action may be output if one or more of the constraints are deemed to not be satisfied.
  • an audio and/or visual alarm (or other suitable alarm) may be generated, e.g. in the vicinity of the server room.
  • notifications may be sent to maintenance and/or system administrative personnel, for instance via email, phone notifications, sound or visual indicators in a control room for the data centre 10.
  • an action may be output if one or more unexpected changes are detected (at step 204) in the wiring configuration. For instance, an alert may be sent to a system administrator providing information related to the unexpected change.
  • a possible action may be to trigger a secure erase operation of a server 122 associated with the unexpected change, if it is still accessible via the network.
  • a further action could be to prevent access to the relevant servers, e.g. by automatically locking a door of the server room of the data centre 10 in which the servers are located, thereby preventing equipment being removed from the server room.
  • Other actions may also be performed, based on a required security level of the specific data centre under consideration.
  • actions following an alert being sent to a system administrator may be performed only after the system administrator confirms that the alert is not a false positive, for instance. While this may increase a delay to applying security measures, it acts to avoid disruptive server downtime in case of false positives. It will be understood that these actions in response to unexpected changes may particularly be useful in the context of detecting, and acting to contain, security breaches or vandalism in a data centre, such as a malicious individual disconnecting a server from a power line, e.g. unauthorised replacement of servers or theft of servers, in a manner that changes the power topology of the system.
  • the determined associations between the outlets of the PDUs 121 and the servers 122 may be used to ensure that sufficient and necessary redundancy is in place for the servers 122 of the system 12, e.g. for disaster avoidance.
  • the described method may also be used to restore redundancy to each of the servers 122, as required.
  • redundancy refers to the design of a system to duplicate certain components such that failure of one of the components (e.g. such that there is disruption to normal power supply) does not impact on the operation and services of critical IT infrastructure.
  • a redundant power supply may be provided so that in the case of a power outage or failure, servers may continue to operate.
  • servers and/or PDUs in a data centre may be added to, moved or removed from a power distribution system relatively frequently, for instance to perform routine work on computer equipment, such as installations, relocations or upgrades. It can therefore be challenging to ensure that redundancy, e.g. power supply redundancy, is maintained in such an electrical power system.
  • a redundant power supply requirement or constraint may be that critical equipment must be connected to at least two different PDU outlets, and/or that the PDU outlets to which the critical equipment component is connected receive power from different power sources 124.
  • each of the electrical equipment units 122 is a server machine. It may be that the operation of each of the servers 122 is critical such that each of the servers 122 needs a redundant power supply, i.e. each of the servers 122 needs to be connected to at least two of the outlets of the PDUs 121. In different cases, it may be that only some of the servers 122 provide services that are considered to be critical, in which case only that critical subset of servers 122 may be required to have a redundant power supply. In further different cases, the plurality of electrical equipment units may provide a number of different types of equipment (e.g. peripheral devices as well as servers), in which case only a subset of the electrical equipment units may be regarded as being critical and need a redundant power supply.
  • when analysing the server-PDU associations at step 204 of the method 20, it may be determined whether any constraints relating to the necessary redundancy of the system 12 are satisfied. This may first involve identifying which of the electrical equipment units are regarded as critical in the sense that they need a redundant power supply. This may be performed via a look up of an equipment inventory repository, for instance. It may be that certain types of electrical equipment units, e.g. servers, are regarded as critical, whereas other types, e.g. peripheral devices, are not.
  • for each of the identified critical electrical equipment units, it may first be determined whether the respective unit is connected to at least two different PDU outlets. This ensures that failure of one of the connected PDU outlets does not mean operation of the critical unit is compromised. If the condition that the critical equipment unit is connected to two PDU outlets is satisfied, then it may be determined whether the respective critical equipment unit is linked to at least two different power sources 124a, 124b. That is, the different PDU outlets to which the critical electrical equipment unit is connected may be required to be provided with power from different power sources. For instance, one of the connected PDUs 121 may receive power from the first power source 124a, and the other of the connected PDUs 121 may receive power from the second power source 124b. This ensures that failure of one of the power sources 124a, 124b does not mean operation of the critical equipment unit is compromised.
  • action may be taken to restore the required redundancy to the system 12. This may involve the controller 14 identifying a PDU outlet to which the critical unit 122 can be connected to restore redundancy.
  • a list of available PDU outlets may be obtained in the first instance, i.e. a list of PDU outlets not in use (that is, not already connected to an electrical equipment unit 122, for instance). Such a list may be obtained from the wiring configuration of determined associations (from step 203). From the determined associations, it is known which PDU outlets are connected to which electrical equipment units 122 and, as such, which PDUs 121 have available outlets not currently in use, i.e. not currently connected to another component.
  • the output action at step 205 may be simply to provide an indication of which critical equipment unit 122 does not satisfy the redundancy requirement, along with the list of available PDU outlets, so that a user or operator can select which of the available PDU outlets to connect to the identified critical equipment unit 122 to restore redundancy.
  • the step of analysing the determined associations against redundancy constraints may further include selecting a particular one (or more) of the available PDU outlets, and then the output action may be to provide a specific recommendation to a user to connect the selected PDU outlet to the identified critical equipment unit 122 to restore redundancy.
  • the identified critical equipment unit 122 along with the list of available PDU outlets, or the specific recommendation, may be provided in any suitable manner. For instance, this could be performed via alerts sent to management software for the system 12, a text message or call to a mobile telephone, or visual alerts in a control room of the data centre 10.
  • the selection of a particular one of the available PDU outlets may be based on a number of different factors, and may be performed to optimise one or more aspects of the wiring configuration and system operation.
  • a physical layout or arrangement of the various components in the data centre 10 may be stored in memory, and may be available to the controller 14.
  • the selection of a particular available PDU outlet may be based on the relative physical proximity of different components of the system 12. For instance, in one example the particular one of the available PDU outlets (or an available outlet of the particular one of the PDUs 121) that is closest to the identified critical unit 122 may be selected, which can assist in maintaining a simple wiring configuration.
  • the particular one of the available PDU outlets that is closest to / adjacent to another (or the other) PDU outlet that is connected to the identified critical unit 122 - but, optionally, which receives power from a different power source 124 - may be selected, again for reasons of configuration simplicity for instance.
  • the selection of a particular one of the available PDU outlets may optionally be based on a loading, at a given time, of different power sources 124 providing power to the PDUs 121.
  • a current loading of different power sources 124 may be obtained in any suitable manner. For instance, the current loading of each power source may be inferred from the determined server-PDU associations.
  • the selection of an available PDU outlet may be made to improve the load balancing in the system 12, e.g. the selected PDU outlet may be part of a PDU 121 that receives power from the power source 124 in the system 12 that has the lowest current loading.
  • the selection of a particular one of the available PDU outlets may be based on a combination of different factors, e.g. according to an optimisation algorithm that optimises across a plurality of different factors.
  • the selected PDU outlet may be identified based on one or more of: maximising the use of adjacent PDU outlets or adjacent PDUs 121; a proximity to the electrical equipment unit in question; improved load balancing of the system 12; and, a consideration of the entire power chain for the identified PDU or PDU outlet.

Abstract

The invention relates to monitoring an electrical power system comprising power distribution units (PDUs) and electrical equipment units to be provided with electrical power. The invention determines a configuration of electrical wiring connections between power outlets of the PDUs and the electrical equipment units. The invention comprises receiving PDU data indicative of power usage of each of the PDU outlets over time, and receiving activity data indicative of an activity metric for each of the electrical equipment units over time. The invention comprises determining, based on the received PDU data and the received activity data, associations between the PDU outlets and the electrical equipment units to determine the electrical wiring connection configuration between the outlets of the PDUs and the electrical equipment units.

Description

DETERMINING AN ELECTRICAL WIRING CONNECTION CONFIGURATION OF AN ELECTRICAL POWER SYSTEM
TECHNICAL FIELD
The invention relates to monitoring an electrical power system comprising a plurality of power distribution units (PDUs) and a plurality of electrical equipment units to be provided with electrical power. In particular, the invention relates to determining a configuration of electrical wiring connections between power outlets of the PDUs and the electrical equipment units.
BACKGROUND
Electrical power systems control the delivery of electrical power to individual electrical equipment units or users that require such electrical power. For instance, in a data centre electrical power is delivered to individual units of electrical equipment in a server room, e.g. individual server machines, from power distribution units (PDUs). In particular, electrical power is delivered via electrical wiring connections between outlets of the PDUs and the server machines.
It is desirable to have knowledge of the topology or configuration of the electrical wiring linking the PDUs and electrical equipment units, such as servers, i.e. knowledge of which PDU outlets are connected to which electrical equipment units. For instance, this can assist in ensuring that sufficient redundancy is in place for server machines or other equipment that provide critical services, in identifying security breaches, or in understanding the effect of withdrawing or shutting down a particular power line, e.g. for maintenance. Such an electrical wiring configuration can change relatively frequently over time; for instance, when server machines are swapped in and out of service, or when maintenance is to be performed on certain components of the electrical power system.
It is known to perform manual mapping of the topology or configuration of electrical wiring of an electrical power system. That is, the physical wiring links may be inspected manually by service personnel. However, such an approach suffers the drawbacks of being error-prone, as well as being relatively slow and expensive to perform. Indeed, a relatively long period of time may elapse between a change in wiring topology occurring and the change being reflected in records, as the records may only be updated during relatively infrequent updates that are performed manually. This can pose issues where knowledge of the wiring configuration may be time sensitive, such as in the context of unplanned server downtime where a set of PDUs need to be replaced.
It is against this background that the present invention is set.
SUMMARY OF THE INVENTION
According to an aspect of the present invention there is provided a computer-implemented method for monitoring an electrical power system comprising a plurality of power distribution units (PDUs) and a plurality of electrical equipment units to be provided with electrical power. The method is for determining a configuration of electrical wiring connections between power outlets of the PDUs and the electrical equipment units. The method comprises receiving PDU data indicative of power usage of each of the PDU outlets over time. The method comprises receiving activity data indicative of one or more activity metrics for each of the electrical equipment units over time. The method comprises determining, based on the received PDU data and the received activity data, associations between the PDU outlets and the electrical equipment units to determine the electrical wiring connection configuration between the outlets of the PDUs and the electrical equipment units.
Determining the associations between the PDU outlets and the electrical equipment units may comprise, for each of the electrical equipment units: estimating a model that describes the activity of the respective electrical equipment unit as a function of the power usage of each of the PDU outlets; and, selecting, based on the estimated model, which of the PDU outlets are associated with the respective electrical equipment unit. Alternatively, determining the associations between the PDUs and the electrical equipment units may comprise, for each of the PDU outlets, estimating a model that describes the power usage of the respective PDU outlet as a function of the activity of each of the electrical equipment units; and, selecting, based on the estimated model, which of the electrical equipment units are associated with the respective PDU outlet.
The function may be a linear function.
The model may be estimated using a linear regression approach. In the model, the power usage of each of the PDU outlets may have a respective coefficient associated therewith. Each coefficient may be indicative of a proportion of the activity of the electrical equipment unit that relates to the respective PDU outlet. Estimating the model may comprise estimating a value of each of the coefficients.
For each of the PDU outlets, if the estimated value of the respective coefficient is greater than a prescribed threshold value, then the respective PDU outlet may be determined to be associated with the respective electrical equipment unit.
If the estimated value of the respective coefficient is within a prescribed tolerance from zero, then the respective PDU outlet may be determined to not be associated with the respective electrical equipment unit.
The selection step may comprise, for one of the PDU outlets: estimating a further model that describes the activity of the respective electrical equipment unit as a function of the power usage of each of the PDU outlets other than said one PDU outlet; and, determining that said one of the PDU outlets is associated with the respective electrical equipment unit if the estimated activity of the further model is within a prescribed tolerance of the estimated activity of the model.
If the estimated activity of the further model is not within a prescribed tolerance, then said one of the PDU outlets may be discarded from the model.
The selection step may comprise repeating the estimation and determination steps for each of the PDU outlets. The further model may describe the activity of the respective electrical equipment unit as a function of the power usage of those PDU outlets not previously discarded from the model.
The steps of estimating the model and selecting which of the PDU outlets are associated with the respective electrical equipment unit may be performed simultaneously.
The estimating and selecting steps may be performed using an elastic net algorithm.
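By way of illustration only, the simultaneous estimation and selection could be sketched as below. This is a minimal standard-library sketch, not the claimed implementation: it standardises each series, fits an elastic net by coordinate descent, and keeps the PDU outlets whose coefficient magnitude exceeds a threshold. The function names, the penalty settings (`alpha`, `l1_ratio`), the 0.05 selection threshold and the sample readings are all assumptions made for the example.

```python
import math


def standardise(series):
    """Centre a series on zero and scale it to unit variance."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    scale = std if std > 0 else 1.0  # a constant series simply stays at zero
    return [(v - mean) / scale for v in series]


def soft_threshold(value, threshold):
    """Lasso shrinkage: move value towards zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(outlet_series, activity, alpha=0.1, l1_ratio=0.5, sweeps=200):
    """Fit activity ~ sum_j coef[j] * outlet_series[j] with an elastic net penalty.

    Coefficients are on the standardised scale. alpha and l1_ratio are
    illustrative values, not values taken from the description.
    """
    cols = [standardise(s) for s in outlet_series]
    y = standardise(activity)
    n, p = len(y), len(cols)
    l1 = alpha * l1_ratio          # lasso part: can set coefficients to zero
    l2 = alpha * (1.0 - l1_ratio)  # ridge part: shrinks coefficients
    coef = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of outlet j with the residual that ignores outlet j.
            rho = 0.0
            for i in range(n):
                others = sum(cols[k][i] * coef[k] for k in range(p) if k != j)
                rho += cols[j][i] * (y[i] - others)
            rho /= n
            denom = sum(v * v for v in cols[j]) / n + l2
            coef[j] = soft_threshold(rho, l1) / denom if denom > 0 else 0.0
    return coef


# Hypothetical readings: outlets 0 and 1 share one server's load, outlet 2 is unrelated.
cpu = [60, 40, 60, 40, 70, 30, 70, 30]
outlets = [
    [120, 80, 120, 80, 140, 60, 140, 60],
    [130, 90, 130, 90, 150, 70, 150, 70],
    [230, 230, 170, 170, 230, 230, 170, 170],
]
coef = elastic_net(outlets, cpu)
associated = [j for j, c in enumerate(coef) if abs(c) > 0.05]
```

In this constructed example the two balanced outlets are perfectly correlated with each other; the ridge term shares the weight between them rather than discarding one, while the lasso term sets the unrelated outlet's coefficient to zero.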
Determining the associations between the PDU outlets and the electrical equipment units may comprise: calculating a distance metric between the power usage of each PDU outlet and the one or more activity metrics of each electrical equipment unit; and, determining, based on the calculated distance metrics, which of the PDU outlets are associated with each of the respective electrical equipment units.
Determining which of the PDU outlets are associated with each of the respective electrical equipment units may comprise solving a linear sum assignment problem to minimise a sum of the distance metrics between associated PDU outlets and electrical equipment units.
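As a hedged illustration of the distance-based approach, the sketch below uses one minus the Pearson correlation as the distance (one possible choice among those contemplated) and solves the assignment by brute force over permutations. The brute-force search is a stand-in that is practical only for very small systems; it is not the rectangular assignment algorithm referenced in the description. It assigns exactly one outlet per unit and assumes there are at least as many outlets as units. All names and readings are hypothetical.

```python
import math
from itertools import permutations


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def distance(a, b):
    """Assumed distance metric: 0 for perfectly correlated series, up to 2."""
    return 1.0 - pearson(a, b)


def assign_outlets(outlet_series, unit_series):
    """Pick one outlet per unit so that the summed distance is minimal."""
    outlets = sorted(outlet_series)
    units = sorted(unit_series)
    cost = {
        (o, u): distance(outlet_series[o], unit_series[u])
        for o in outlets
        for u in units
    }
    # Try every ordered choice of len(units) distinct outlets (brute force).
    best = min(
        permutations(outlets, len(units)),
        key=lambda chosen: sum(cost[o, u] for o, u in zip(chosen, units)),
    )
    return dict(zip(units, best))


# Hypothetical time series: activity per server and power per PDU outlet.
unit_series = {
    "server_a": [10, 50, 20, 60, 30],
    "server_b": [40, 15, 55, 20, 45],
}
outlet_series = {
    "outlet_1": [140, 115, 155, 120, 145],  # tracks server_b
    "outlet_2": [110, 150, 120, 160, 130],  # tracks server_a
    "outlet_3": [100, 100, 101, 101, 100],  # nothing attached
}
assignment = assign_outlets(outlet_series, unit_series)
```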
Each electrical equipment unit may be associated with more than one of the PDU outlets. Optionally, each electrical equipment unit may be associated with two PDU outlets.
The method may comprise analysing the determined electrical wiring configuration against one or more defined constraints to be satisfied by the electrical wiring configuration. The method may comprise outputting an action if the determined electrical wiring configuration does not satisfy each of the constraints.
The one or more defined constraints may include a redundancy constraint comprising, for each of the plurality of electrical equipment units, that the respective electrical equipment unit must be connected to at least two of the PDU outlets. The one or more defined constraints may include a redundancy constraint comprising, for each of the plurality of electrical equipment units, that at least two of the PDU outlets to which the respective electrical equipment unit is connected must receive power from different power sources of the electrical power system.
The method may comprise identifying one or more of the electrical equipment units that provide critical functionality. In some examples, only the identified one or more electrical equipment units may be analysed against the redundancy constraint.
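The two redundancy constraints above can be checked directly against the determined associations. The following is a minimal sketch under assumed data structures (a mapping from each unit to its associated outlets, and a mapping from each outlet to its upstream power source); the names and the wording of the violation messages are hypothetical.

```python
def check_redundancy(associations, outlet_source, critical_units):
    """Return {unit: reason} for critical units that fail a redundancy constraint."""
    violations = {}
    for unit in sorted(critical_units):
        outlets = set(associations.get(unit, []))
        # Constraint 1: connected to at least two different PDU outlets.
        if len(outlets) < 2:
            violations[unit] = "fewer than two PDU outlets"
            continue
        # Constraint 2: those outlets are fed by at least two power sources.
        sources = {outlet_source[o] for o in outlets}
        if len(sources) < 2:
            violations[unit] = "outlets share one power source"
    return violations


# Hypothetical determined associations and power chain.
associations = {
    "server_a": ["pdu1_out1", "pdu2_out1"],
    "server_b": ["pdu1_out2", "pdu1_out3"],
    "server_c": ["pdu2_out2"],
    "printer": ["pdu1_out4"],
}
outlet_source = {
    "pdu1_out1": "ups_a", "pdu1_out2": "ups_a",
    "pdu1_out3": "ups_a", "pdu1_out4": "ups_a",
    "pdu2_out1": "ups_b", "pdu2_out2": "ups_b",
}
critical = {"server_a", "server_b", "server_c"}  # the printer is not critical
violations = check_redundancy(associations, outlet_source, critical)
```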
For each of the plurality of electrical equipment units, if the respective electrical equipment unit does not satisfy the constraint of being connected to at least two of the PDU outlets then the method may comprise selecting a PDU outlet from one or more available PDU outlets of the plurality of PDU outlets. Outputting the action may comprise providing a recommendation to connect the respective electrical equipment unit to the selected PDU outlet. Selection of the PDU outlet from the plurality of PDU outlets may be based on a physical proximity of the respective electrical equipment unit to the one or more available PDU outlets. Optionally, the selected PDU outlet may be the PDU outlet in closest physical proximity to the respective electrical equipment unit.
Selection of the PDU outlet from the plurality of PDU outlets may be based on a respective current loading of each of a plurality of power sources providing power to the plurality of PDU outlets. Optionally, the selected PDU outlet may be a PDU outlet that receives power from the power source having the lowest current loading.
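One way the selection might be sketched is shown below: available outlets are derived from the determined associations, and the outlet fed by the least-loaded power source is preferred, with physical proximity as a tie-breaker. The ordering of these two criteria, the one-dimensional `position` value standing in for a physical layout, and all names are assumptions for illustration.

```python
def select_outlet(associations, outlets, source_load, unit_position):
    """Recommend an unused outlet, or None if every outlet is in use."""
    in_use = {o for connected in associations.values() for o in connected}
    available = [o for o in sorted(outlets) if o not in in_use]
    if not available:
        return None
    return min(
        available,
        key=lambda o: (
            source_load[outlets[o]["source"]],            # lowest loading first
            abs(outlets[o]["position"] - unit_position),  # then the closest outlet
        ),
    )


# Hypothetical inventory: power source and rack position for each outlet.
outlets = {
    "pdu1_out1": {"source": "ups_a", "position": 1},
    "pdu1_out2": {"source": "ups_a", "position": 2},
    "pdu2_out1": {"source": "ups_b", "position": 8},
    "pdu2_out2": {"source": "ups_b", "position": 9},
}
associations = {"server_c": ["pdu1_out1"], "server_a": ["pdu2_out1"]}
source_load = {"ups_a": 0.7, "ups_b": 0.4}  # fraction of capacity in use
recommended = select_outlet(associations, outlets, source_load, unit_position=2)
```

Here the nearer free outlet is passed over in favour of one on the less loaded source; swapping the order of the key tuple would prioritise proximity instead.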
The one or more defined constraints may include that each of the PDUs has no more than a defined maximum number of electrical equipment units connected thereto. The one or more defined constraints may include that a particular electrical equipment unit must be connected to a PDU in a defined subset of the PDUs.
Outputting the action may include generating an alarm, optionally an audio and/or visual alarm. Outputting the action may include outputting a notification to one or more personnel, optionally maintenance and/or system administrative personnel.
The method may comprise comparing the determined associations at a current time step against previously determined associations from a previous time step to identify changes in the wiring configuration. The method may comprise, for each identified change, determining whether the change is an unexpected change, and outputting an action if the change is an unexpected change.
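The comparison of current and previous associations can be illustrated with set differences. This is a minimal sketch assuming associations are stored as (unit, outlet) pairs and that units with a planned move, addition or removal are tagged in advance; the function and variable names are hypothetical.

```python
def unexpected_changes(previous, current, expected_units):
    """List association changes that were not announced in advance."""
    # Associations present now but not before, then those that disappeared.
    changes = [("added", unit, outlet) for unit, outlet in sorted(current - previous)]
    changes += [("removed", unit, outlet) for unit, outlet in sorted(previous - current)]
    # A change is expected only if its unit was tagged beforehand.
    return [change for change in changes if change[1] not in expected_units]


previous = {("server_a", "out1"), ("server_a", "out2"), ("server_b", "out3")}
current = {("server_a", "out1"), ("server_a", "out2"), ("server_c", "out3")}
expected_units = {"server_c"}  # server_c was tagged as about to be installed
alerts = unexpected_changes(previous, current, expected_units)
```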
The action may comprise generating an alert for one or more service or maintenance personnel. The action may comprise triggering a secure erase of the electrical equipment unit relevant to the unexpected change. The action may comprise causing physical access to the electrical equipment unit relevant to the unexpected change to be restricted.
The received PDU data may be time series data indicative of power usage of the PDU outlets over a defined historical time period. The received activity data may be time series data indicative of activity metrics of the electrical equipment units over the defined historical time period. The method may comprise, prior to determining the associations between the PDU outlets and the electrical equipment units, realigning the time series data so that samples of the PDU data and activity data relate to the same time steps.
The realigning step may comprise up-sampling the PDU data and/or the activity data, and interpolating the up-sampled data to a defined sampling period. The realigning step may comprise down-sampling the interpolated data to a desired sampling period with samples of the PDU data and activity data relating to the same time steps.
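For illustration, the realignment could be sketched as follows, assuming integer timestamps in seconds, strictly increasing sample times, linear interpolation, and clamping to the first or last reading outside the sampled range. None of these choices is prescribed by the description, and the names are hypothetical.

```python
def interpolate(samples, t):
    """Linearly interpolate sorted (time, value) samples at time t."""
    if t <= samples[0][0]:
        return samples[0][1]  # before the first sample: clamp
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return samples[-1][1]  # after the last sample: clamp


def realign(samples, start, stop, fine_step, target_step):
    """Up-sample onto a fine grid, then down-sample to the target period."""
    fine = [interpolate(samples, t) for t in range(start, stop + 1, fine_step)]
    keep_every = target_step // fine_step
    return fine[::keep_every]


# Hypothetical readings: the PDU reports on the 10 s marks, the server 5 s later.
pdu_samples = [(0, 100), (10, 120), (20, 140), (30, 160)]
cpu_samples = [(5, 20), (15, 40), (25, 60)]

pdu_aligned = realign(pdu_samples, start=0, stop=30, fine_step=5, target_step=10)
cpu_aligned = realign(cpu_samples, start=0, stop=30, fine_step=5, target_step=10)
```

After realignment both lists hold one value per common time step (0 s, 10 s, 20 s and 30 s in this example), so they can be compared sample by sample.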
The electrical power system may be a data centre electrical power system. One or more of the electrical equipment units may be server machines.
The one or more activity metrics of each electrical equipment unit may include central processing unit (CPU) utilisation of the electrical equipment unit. The one or more activity metrics of each electrical equipment unit may include memory utilisation of the electrical equipment unit. The one or more activity metrics of each electrical equipment unit may include a number of bytes transferred in input/output operations generated by a process of the electrical equipment unit. The one or more activity metrics of each electrical equipment unit may include disk accesses per second. The one or more activity metrics of each electrical equipment unit may include graphics processing unit (GPU) activity of the electrical equipment unit.
According to another aspect of the present invention there is provided a non-transitory, computer-readable storage medium storing instructions thereon that when executed by a processor cause the processor to perform a method as defined above.
According to another aspect of the present invention there is provided a controller for monitoring an electrical power system comprising a plurality of power distribution units (PDUs) and a plurality of electrical equipment units to be provided with electrical power. The controller is for determining a configuration of electrical wiring connections between power outlets of the PDUs and the electrical equipment units. The controller comprises one or more processors configured to: receive PDU data indicative of power usage of each of the PDU outlets over time; receive activity data indicative of one or more activity metrics for each of the electrical equipment units over time; and, determine, based on the received PDU data and the received activity data, associations between the PDU outlets and the electrical equipment units to determine the electrical wiring connection configuration between the outlets of the PDUs and the electrical equipment units.
BRIEF DESCRIPTION OF THE DRAWINGS
Examples of the invention will now be described with reference to the accompanying drawings, in which:
Figure 1 schematically illustrates an electrical power system of a data centre in accordance with an example of the invention; and,
Figure 2 shows the steps of a method for monitoring the electrical power system of Figure 1, in accordance with an example of the invention.
DETAILED DESCRIPTION
Figure 1 is a schematic illustration of a data centre 10 that is used to house computer systems and associated components. The data centre 10 may be in the form of a building, or a dedicated space within a building, for instance.
Figure 1 schematically illustrates an electrical power system 12 in which electrical power is supplied to systems and components in the data centre 10. The electrical power system 12 includes a plurality of power distribution units (PDUs) 121 in the form of devices that distribute power from an input to a plurality of outlets of each PDU 121. PDUs are typically used for the distribution of power to equipment such as racks of computers and/or networking equipment in a data centre. The input of each PDU 121 may receive power from any suitable power source 124, e.g. an Uninterruptible Power Supply (UPS), (backup) generator or other utility power source. Different ones of the PDUs 121 may receive power from different power sources 124. For instance, a first set 121a of the PDUs 121 may receive power from a first UPS 124a, and a second set 121b of the PDUs 121 may receive power from a second UPS 124b, different from the first UPS 124a.
The electrical power system 12 includes a plurality of electrical equipment units or components 122 that need to be provided with electrical power to operate or function. In the described example, the PDUs 121 may provide power to electrical equipment located in a server room or space 101 of the data centre 10. The electrical equipment units 122 in the server room 101 may primarily include server machines (or, simply, servers) that provide services, e.g. processing or saving/storage services, to various client stations, e.g. computers. The electrical equipment units 122 may also include other server room equipment that requires electrical power, such as peripheral devices or hardware.
The PDUs 121 supply electrical power to the servers 122 via physical links 123 therebetween. In particular, the links are in the form of electrical wires 123 that each connect an outlet of one of the PDUs 121 to one of the servers 122. As is illustrated in Figure 1, each server 122 may be connected to more than one of the PDUs 121. In the context of a data centre, this provides redundancy in the electrical power system as a failure of one PDU does not necessarily mean that operation of an associated server stops, thereby guarding against unplanned downtime of service-critical equipment.
The particular wiring configuration of the electrical power system 12 - i.e. which PDU outlets 121 are connected via the wires 123 to which servers 122 - may change relatively frequently over time. In a data centre, servers and associated equipment may be swapped out of commission relatively regularly for maintenance or upgrade, e.g. a particular power line may be shut down for a period. MAC (moves, adds, changes) operations may be performed to install, relocate and/or upgrade various pieces of electrical equipment such as servers.
To monitor the mapping of electrical wiring between the outlets of the PDUs 121 and servers 122 manually would be expensive, time consuming, and prone to errors. Furthermore, when performed manually, updates to the mapping may be performed relatively infrequently, meaning that a relatively long time may pass between a change in the wiring configuration occurring, and this change being reflected in the records.
Although the electrical power system is described in the context of providing power to equipment in a data centre, it will be appreciated that the described electrical power system may be used in different contexts where PDUs provide electrical power to various electrical equipment units and components, e.g. in a home or office context, at a manufacturing site, etc.
Figure 1 also includes a system or controller 14 for monitoring the electrical power system 12. In particular, the system 14 is for determining the configuration of the physical wiring or links 123 between the outlets of the PDUs 121 and the servers 122, as will be discussed in greater detail below. The controller 14 includes an input configured to receive data indicative of the operation of the electrical power system 12, for instance data from the PDUs 121, the servers 122, and/or another source, e.g. a storage device, that stores data indicative of the operation of the electrical power system 12. The controller 14 includes an output that may transmit alerts or control signals based on the determined wiring configuration.
The controller 14 may be in the form of, or include, any suitable computing device, for instance one or more functional units or modules implemented on one or more computer processors. Such functional units may be provided by suitable software running on any suitable computing substrate using conventional or custom processors and memory. The one or more functional units may use a common computing substrate (for example, they may run on the same server) or separate substrates, or one or both may themselves be distributed between multiple computing devices. A computer memory may store instructions for performing the methods to be performed by the controller 14, and the processor(s) may execute the stored instructions to perform the methods.
Although indicated as being separate from the electrical power system 12 in the illustrated example, the system or controller 14 may be regarded as being part of the electrical power system 12 in different examples. The controller may be located in any suitable location. For instance, the controller may be in the vicinity of one or more other components of the electrical power system, e.g. in the server room 101 with the server machines of the data centre 10, or in a different location within the data centre 10. Alternatively, the controller may be remote from other components of the electrical power system, and/or remote from the data centre. Indeed, in some examples the controller may be regarded as one of the electrical equipment units that is supplied with power by the PDUs 121, and monitors itself as part of the method described below to automatically determine and monitor a wiring topology between the PDU outlets and electrical equipment units.
The present invention is advantageous in that it provides for automatic determination and monitoring of a configuration or topology of the physical links or wiring connections between outlets of PDUs of an electrical power system and electrical equipment units to which the PDUs provide electrical power, e.g. server machines in a data centre. In particular, the present invention is advantageous in that the automatic monitoring allows for changes in the wiring configuration - which may occur relatively frequently - to be identified in real time, quasi real time, or at any other desired frequency. This means that action in response to identified changes in the configuration may be performed in a timely manner. For instance, in the case of a security breach where a server is disconnected from the power supply, the breach is detected immediately, meaning that action can be taken quickly to contain the breach. As another example, in a case where one or more of the electrical equipment units have unplanned downtime, then the associated PDUs can be identified immediately, and replaced or repaired if necessary, thereby minimising the unplanned downtime.
The automatic monitoring of the present invention also provides an inexpensive and accurate determination of the wiring configuration, and removes the risk of errors that occur, and the expense involved, when such tasks are performed manually.
The invention achieves these benefits by determining a mapping between outlets of the PDUs and the electrical equipment units, e.g. servers, representing the physical links between the PDU outlets and the servers. In particular, the mapping is determined based on analysing the power usage of each of the PDU outlets in conjunction with the (processing) activity of the servers in order to determine correlations or patterns indicative of physical wiring connections between particular ones of the PDU outlets and particular ones of the servers. This is described in greater detail below. Beneficially, the invention uses data that is readily available in order to automatically map the wiring configuration.
Figure 2 shows steps of a method 20 performed by the system or controller 14 to determine and monitor a configuration of electrical wiring connections 123 between power outlets of the PDUs 121 and the electrical equipment units 122, e.g. servers and/or other server room equipment.
At step 201, the method 20 involves receiving PDU data indicative of power usage of each of the PDU outlets over time. In particular, the PDU data is received at the input of the controller 14. The PDU data may be in the form of time series data indicative of power usage of the PDU outlets over a defined historical time period, i.e. historical time series data in the form of a power consumption signature indicative of temporal power consumption of each PDU outlet. The time series data may be sampled at regular intervals.
The PDU data may be received or obtained directly from each of the PDU outlets (or from each of the PDUs 121). Alternatively, the PDU data may be obtained from a central platform that receives and stores power consumption data for each of the PDU outlets. The PDU data may be received by the controller 14 substantially continuously, meaning power consumption data is received in real time or quasi real time, or the PDU data may be received by the controller 14 at regular intervals with data covering a prescribed period of operation.
Also at step 201, the method 20 involves receiving activity or performance data indicative of one or more activity or performance metrics for each of the electrical equipment units 122, e.g. servers, over time. Similarly to the PDU data above, the activity data is received at the input of the controller 14. The activity data may be in the form of time series data indicative of one or more measures of server activity or performance over a defined historical time period, i.e. historical time series data in the form of a server activity signature indicative of temporal server activity or performance of each server 122. The time series data may be sampled at regular intervals.
The activity data may be received or obtained from each server 122 directly, for instance via standard monitoring interfaces commonly available on servers, e.g. VMware, vCenter, Windows Sysinternals, SolarWinds IT monitoring software, HPE OneView, etc. That is, the activity data may be retrieved from each server 122 by connecting to a management server or special API (application programming interface) on each server. Alternatively, the activity data may be received from a platform management system (PMS) for the servers.
The activity data may include any suitable data indicative of activity or performance of each server 122 over time. For instance, the activity data may include processor usage (central processing unit (CPU) percentage), memory usage, bytes read/written on the disk (e.g. disk access per second), bytes sent/received on a network interface, graphics processing unit (GPU) activity of the server, etc.
At step 202, the method 20 may optionally involve realigning the received PDU and activity time series data so that samples of the PDU data and activity data relate to the same time steps. In particular, a sampling period of received data may not be constant over time. For instance, even if the nominal sampling period is five seconds, in practice it may vary between four and six seconds. Each server may also have a different sampling rate and/or be sampled at different instants, e.g. a first server is sampled at 0, 5, 10, ... seconds, whereas a second server is sampled at 2, 12, 22, ... seconds. The received data may therefore be manipulated to have the same sampling period and the same sampling instants.
This may be performed by interpolation, e.g. linear interpolation, of the received data.
In one example, the data realignment may involve up-sampling the received PDU data and/or the activity data, and interpolating the up-sampled data to a defined sampling period. The up-sampled, interpolated data may then be down-sampled to a desired sampling period (typically the same sampling period as the original data), e.g. one second, with samples of the PDU data and activity data relating to the same time steps, i.e. the resulting data has sampling instants that are common across the PDU outlets and servers. The down-sampling means that desired data is retained with the remaining data discarded. This data realignment may beneficially allow for more accurate analysis and comparison of PDU data and server data to identify patterns and associations in the following steps.
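By way of illustration only, the realignment of step 202 may be sketched as follows. This is a minimal sketch that interpolates each series directly onto a common grid rather than explicitly up-sampling and then down-sampling; the function names `interpolate_at` and `realign`, the series names and the sample values are hypothetical and are not taken from the described system.

```python
from bisect import bisect_right

def interpolate_at(times, values, t):
    """Linearly interpolate (times, values) at instant t; clamp outside the range."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

def realign(series, period):
    """Resample every series onto one grid covering their common time span.

    series: dict name -> (times, values), with times strictly increasing.
    Returns (grid, dict name -> resampled values).
    """
    start = max(times[0] for times, _ in series.values())
    end = min(times[-1] for times, _ in series.values())
    steps = int((end - start) // period)
    grid = [start + k * period for k in range(steps + 1)]
    aligned = {
        name: [interpolate_at(times, values, t) for t in grid]
        for name, (times, values) in series.items()
    }
    return grid, aligned

# Hypothetical data: an outlet sampled every 5 s and a server sampled every 10 s.
raw = {
    "outlet_1": ([0, 5, 10, 15, 20], [100.0, 110.0, 120.0, 110.0, 100.0]),
    "server_a": ([2, 12, 22], [20.0, 40.0, 20.0]),
}
grid, aligned = realign(raw, period=2)
```

After this step every series in `aligned` has one value per entry of `grid`, so PDU and server samples can be compared instant by instant.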
At step 203, the method 20 involves determining, based on the received PDU data and the received activity data, associations between the PDU outlets and the electrical equipment units 122 to determine the electrical wiring connection configuration between the outlets of the PDUs 121 and the electrical equipment units 122.
In one example, the associations may be determined or inferred for each server (or other electrical equipment unit) 122 by estimating a model that describes the activity of the respective server 122 as a function of the power usage of each of the PDU outlets. The estimated model may then be used to determine which of the PDU outlets are associated with the respective server 122. In different examples, a model that describes the power usage of a respective PDU outlet as a function of the activity of the servers 122 may be estimated, and the estimated model may then be used to determine which of the servers 122 are associated with the respective PDU outlet.
In more detail, consider a first one of the servers 122. The activity data, e.g. time series, for said server 122 is extracted, as well as the power usage data for each of the PDU outlets. A model is then fitted to predict or estimate server activity using the power signature time series of each of the PDUs 121. For instance, the fitted model may be a linear model of the form:

$$s = a_0 + a_1 p_1 + a_2 p_2 + a_3 p_3 + \cdots$$

where $s$ is the activity of the server 122 under consideration, $p_1, p_2, p_3, \ldots$ are the power usages of the outlets of the PDUs 121 of the system 12, and $a_1, a_2, a_3, \ldots$ are coefficients representing a proportion of the activity on the server 122 under consideration which relates to the power consumption of the respective PDU outlet. $a_0$ may be regarded as an intercept term which represents a baseline level of activity of the server 122 under consideration that is not explained by power consumption of any of the PDU outlets.
It is assumed to be highly unlikely that increased power consumption at a PDU outlet corresponds to a decrease in server activity. As such, a constraint may be imposed that the coefficients are taken to be non-negative values, i.e.

$$a_1, a_2, a_3, \ldots \geq 0$$
Once the model has been fitted for the server 122 under consideration, a step is performed to discard those PDU outlets that are unrelated to the respective server 122 from the model. This may be referred to as a feature selection step. In particular, the feature selection step examines or analyses the inferred coefficients in the estimated model and, specifically, the strength of the relationship between the time series for each of the PDU outlets and the server 122. PDU outlets whose inferred coefficients are deemed not to differ significantly from zero are discarded, and the remaining PDU outlets are deemed to be connected to the server 122 under consideration.
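Purely as an illustrative sketch, the fitting of a linear model with non-negative coefficients, followed by discarding outlets whose coefficients are close to zero, could look as follows. The coordinate-descent solver, the function name `fit_nonnegative_linear`, the data and the fixed threshold of `1e-3` are assumptions made for this example; in particular, the fixed threshold merely stands in for a proper test of whether a coefficient differs significantly from zero.

```python
def fit_nonnegative_linear(outlet_power, activity, sweeps=500):
    """Least-squares fit of activity = a0 + sum_j a_j * p_j with every a_j >= 0.

    outlet_power: dict outlet name -> list of power samples.
    activity: list of server activity samples on the same time grid.
    Returns (a0, dict outlet name -> coefficient).
    """
    n = len(activity)
    mean_s = sum(activity) / n
    means = {k: sum(v) / n for k, v in outlet_power.items()}
    # Centre the data so that the intercept can be recovered after the fit.
    centred = {k: [x - means[k] for x in v] for k, v in outlet_power.items()}
    residual = [y - mean_s for y in activity]
    coef = {k: 0.0 for k in outlet_power}
    for _ in range(sweeps):
        for k, p in centred.items():
            norm = sum(x * x for x in p)
            if norm == 0.0:
                continue  # a constant outlet carries no information
            step = sum(x * r for x, r in zip(p, residual)) / norm
            new = max(0.0, coef[k] + step)  # project onto a_j >= 0
            delta = new - coef[k]
            residual = [r - delta * x for r, x in zip(residual, p)]
            coef[k] = new
    a0 = mean_s - sum(coef[k] * means[k] for k in coef)
    return a0, coef

# Hypothetical data: the server activity depends on outlets 1 and 3 only.
p1 = [100, 120, 150, 130, 110, 160, 140, 105]
p2 = [200, 190, 210, 205, 195, 200, 198, 207]
p3 = [80, 95, 70, 110, 90, 75, 100, 85]
outlets = {"outlet_1": p1, "outlet_2": p2, "outlet_3": p3}
activity = [5 + 0.3 * x + 0.2 * z for x, z in zip(p1, p3)]

a0, coef = fit_nonnegative_linear(outlets, activity)
# Feature selection stand-in: keep outlets whose coefficient is clearly above zero.
connected = [k for k, a in coef.items() if a > 1e-3]
```

In this noise-free example the outlets retained in `connected` are those that were used to construct the activity series.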
The feature selection step may be approached in a stepwise manner. For instance, one of the PDU outlets may be considered for removal from the estimated model. A comparison of model metrics with and without said one of the PDU outlets may be performed. For instance, this may involve estimating a further model in the absence of the data associated with the PDU outlet being considered for removal, and comparing the model and further model. If there is no statistically significant degradation of performance of the model for the server 122 under consideration, then it may be assumed that said one PDU outlet is not connected to the server 122, and said one PDU outlet is removed from the model. Otherwise, said one PDU outlet is retained in the model. This process may be repeated for each of the PDU outlets. A linear regression approach may be utilised to perform the feature selection step.
Furthermore, a bootstrap approach may be used to perform feature selection. In particular, the time series data may be broken into smaller subsections of data, and then joined together in order to minimise the effect of unusual instances in the data when estimating the model.
The above steps are repeated for each one of the servers 122 in turn until it has been inferred which of the PDU outlets are associated with, and therefore connected to, which of the servers 122. Although in the above the steps of estimating a (linear) model and performing feature selection are described as separate steps with feature selection following model estimation, these steps may alternatively be performed simultaneously. In particular, this may be performed using an elastic net regularisation algorithm. As is known to the skilled person, the elastic net is a regularised regression method that linearly combines penalties of lasso and ridge methods, also known to the skilled person. The elastic net algorithm is described, for instance, in ‘Regularization and Variable Selection via the Elastic Net’, Zou et al., J. R. Statist. Soc. B (2005), 67, Part 2, pp. 301-320. The lasso (least absolute shrinkage and selection operator) method or algorithm is described, for instance, in ‘Regression shrinkage and selection via the lasso’, Tibshirani, J. R. Statist. Soc. B (1996), 58, No. 1, pp. 267-288. The ridge regression algorithm is described, for instance, in ‘Ridge Regression: Biased Estimation for Nonorthogonal Problems’, Hoerl et al., Technometrics (1970), Vol. 12, No. 1, pp. 55-67.
As mentioned, the elastic net algorithm is a combination of the lasso model and the ridge regression model. In both cases, these models aim to fit a linear model between an outcome - in this case, the server time series - and predictors - in this case, the PDU time series - while aiming to minimise the complexity of the resulting model. In this context, ‘complexity’ refers to a number of variables used in the model. The lasso model achieves this by discarding predictors by setting the value of their coefficient to zero, while ridge regression achieves this by shrinking the coefficients towards zero. In both cases, the coefficients can be estimated using coordinate descent, which aims to minimise a loss function that penalises for the complexity of the model.
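For reference, the loss function minimised by the elastic net may be written in the following commonly used form. The time index $t$, the sample notation $s_t$ and $p_{j,t}$, and the penalty weights $\lambda_1$ (lasso) and $\lambda_2$ (ridge) are notation introduced here for illustration; the values of the weights would need to be chosen for the particular system.

$$\min_{a_0, a_1, a_2, \ldots} \; \sum_{t} \Big( s_t - a_0 - \sum_{j} a_j \, p_{j,t} \Big)^2 + \lambda_1 \sum_{j} |a_j| + \lambda_2 \sum_{j} a_j^2$$

Setting $\lambda_2 = 0$ recovers the lasso model, and setting $\lambda_1 = 0$ recovers ridge regression.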
Used in isolation, the lasso model could potentially discard a PDU time series that is highly correlated with another of the PDU time series, e.g. where power use is balanced across two PDU outlets. Also, the use of ridge regression in isolation would fail to discard any of the PDU outlets. The elastic net algorithm allows for the combination of these approaches, in particular allowing for irrelevant PDU outlets to be discarded as such, while relevant, but highly correlated, PDU outlets are retained.
In further modifications of the example in which a model is estimated, the model may be a nonlinear model rather than a linear model. For instance, a random forest may be used, which can also simultaneously infer or estimate a relationship between a server and the PDU outlets, while discarding extraneous PDU outlets. Once a model has been estimated for each server, i.e. once the associations between each server and the PDU outlets have been inferred, the determined associations for each server are updated in a repository or other data storage of server-PDU associations, which may be part of the controller or system 14 or separate therefrom.
The method steps to determine the server-PDU associations may for instance be scheduled at regular intervals using the data generated in the intervening timestamps. Such scheduling allows for the repository of server-PDU associations to be maintained and kept up-to-date with no intervention needed from an operator.
In another example, the step of determining the associations between the PDU outlets and the servers 122 (step 203) is performed based on calculated distance metrics. In particular, the distances between the power usage or consumption time series of each PDU outlet and the activity or performance metric time series of each server 122 are calculated. The calculated distances indicate how similar, i.e. how correlated, two time series are: the greater the distance, the less similar the two time series are. Conversely, lesser distances indicate greater similarity between the time series signals.
The distance metrics may be calculated using any suitable method, for instance using a mean square error, or using a correlation coefficient (e.g. Pearson correlation coefficient, Kendall coefficient, Spearman coefficient, etc.) as a measure of a linear correlation between two sets of data, i.e. two time series. Indeed, the distances between two time series may be calculated in different ways, such as: taking the multiplicative inverse of the correlation; using a Matrix Profile algorithm, which is known to the skilled person, and is described for instance in ‘Matrix Profile XII: MPdist: A Novel Time Series Distance Measure to Allow Data Mining in More Challenging Scenarios’, Gharghabi et al., 2018 IEEE International Conference on Data Mining, pp. 965-970; computing the structural similarity index measure (SSIM), which is known to the skilled person, and is described for instance in ‘Image Quality Assessment: From Error Visibility to Structural Similarity’, Wang et al., IEEE Transactions on Image Processing (2004), Vol. 13, No. 4, pp. 600-612; using dynamic time warping; or using transform-based similarity methods, including Discrete Fourier Transform (DFT) or Discrete Wavelet Transform (DWT).
Once all of the distances between pairs of time series have been calculated, then PDU outlets are assigned to servers 122 so that the sum of distances between the two assigned time series, across all of the assignments, is minimised. That is, the sum of all distances between chosen couples/pairs of time series (or other form of the received data) is minimised. This is referred to as the linear sum assignment problem, as is known to the skilled person, and it can be solved, for instance, as described in ‘On Implementing 2D Rectangular Assignment Algorithms’ Crouse, IEEE Transactions on Aerospace and Electronic Systems (2016), Vol. 52, No. 4, pp. 1679-1696.
In the present context, each server machine 122 may be required to have redundant power supplies. As such, multiple PDU outlets may be associated to each server. One hypothesis for the linear sum assignment problem may therefore be that there are two power supplies (PDU outlets) per server, and the problem is solved to minimise the sum of distances based on this constraint or assumption. The resulting/determined assignments or associations are stored in a repository or data store (part of, or separate from, the controller 14). Similarly to above, the process of determining assignments or associations may be scheduled to be repeated at regular intervals using the data generated in the intervening timestamps. The set of determined associations together constitute the determined configuration of the wiring connections between the PDU outlets and servers 122.
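As a non-limiting sketch, the distance-based assignment under the hypothesis of two PDU outlets per server could be illustrated as follows. The distance used here, one minus the Pearson correlation, is an assumption chosen for simplicity and is only one of the possible distance measures. The exhaustive search over permutations is practical only for very small systems and merely stands in for a dedicated linear sum assignment solver (for example, SciPy's `linear_sum_assignment`). All names and data are hypothetical, and the series are assumed to be non-constant.

```python
from itertools import permutations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def distance(x, y):
    """Small for strongly correlated series, large for anti-correlated ones."""
    return 1.0 - pearson(x, y)

def assign_outlets(outlets, servers, supplies_per_server=2):
    """Assign outlets to servers so that the summed distance is minimised."""
    # Each server appears once per power supply it is assumed to have.
    slots = [name for name in servers for _ in range(supplies_per_server)]
    names = list(outlets)
    cost = {(o, s): distance(outlets[o], servers[s]) for o in names for s in servers}
    best, best_cost = None, float("inf")
    for perm in permutations(names, len(slots)):
        total = sum(cost[o, s] for o, s in zip(perm, slots))
        if total < best_cost:
            best, best_cost = perm, total
    result = {s: [] for s in servers}
    for o, s in zip(best, slots):
        result[s].append(o)
    return {s: sorted(v) for s, v in result.items()}

# Hypothetical activity series for two servers and power series for four outlets.
servers = {"server_a": [10, 50, 20, 60, 30], "server_b": [40, 10, 50, 20, 10]}
outlets = {
    "outlet_1": [100 + v for v in servers["server_a"]],
    "outlet_2": [200 + 2 * v for v in servers["server_b"]],
    "outlet_3": [90 + 0.5 * v for v in servers["server_a"]],
    "outlet_4": [80 + v for v in servers["server_b"]],
}
associations = assign_outlets(outlets, servers)
```

The returned dictionary maps each server to the outlets assigned to it, and corresponds to the set of associations that would be stored in the repository.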
Returning to Figure 2, at step 204 the method 20 may optionally involve analysing the server-PDU associations determined in the previous step. In one example, this may involve analysing the determined electrical wiring configuration against one or more defined constraints to be satisfied by the electrical wiring configuration. These constraints may for instance include that a particular server 122 needs to be located in a specific rack of PDUs 121, and/or that each server 122 (or a certain subset of servers 122) needs to be connected to at least two different PDUs 121 (for redundancy capability). The constraints may additionally or alternatively include that the PDUs 121 must have at most a predefined number of servers 122 connected thereto, for instance because of power limits of the power source 124 that the PDUs 121 are connected to. Such analysis against constraints may be performed irrespective of how the server-PDU associations are determined in step 203, but may particularly be used in the example in which a model is estimated to determine the associations.
In another example (which may be performed in addition to, or in isolation from, the constraints analysis), analysing the server-PDU associations may involve comparing the set or list of associations determined in step 203 (a current time step) with a list of server-PDU associations determined or generated at a previous iteration or run of the process (at a previous time step). The previous associations may be stored in the data store, and retrieved to perform the comparison. The comparison step aims to identify differences between the current and previous sets of associations to identify changes that have occurred to the wiring configuration.
One way in which this may be performed is by first identifying associations that are present in the current (new) list, but were not present previously, i.e. not present in the previous list. For instance, a first element (association) may be picked from the current list of associations. If this element is in the previous list then the next element of the current list is considered. If the first element is not in the previous list, then it may be determined whether such a change is expected. For instance, a particular server may be tagged prior to the determination of the current list of associations to indicate that it is about to be moved, added, etc. In this case, the change may be regarded as being expected. On the other hand, if no such tag or other information is available, then the change may be regarded as unexpected, and the particular element may be marked as such. This is repeated for each element (i.e. each entry or row of the current list).
The above steps of considering each element of the new list may be followed by considering each element of the previous list to identify associations that were present previously, but have now disappeared, i.e. they do not appear in the current list. Again, where a change is identified from the previous list to the new list, a check may be performed to determine whether the change is expected, e.g. information is available to indicate that a particular server was about to be removed from the system prior to determining the current list of associations.
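The two comparison passes described above may, for instance, be sketched as follows. The representation of an association as an (outlet, unit) pair, the function name `compare_associations` and the identifiers used in the example are hypothetical, as is the use of a simple set of tagged units to indicate expected changes.

```python
def compare_associations(previous, current, expected_units=frozenset()):
    """Classify wiring changes between two runs of the association step.

    previous, current: sets of (outlet, unit) pairs.
    expected_units: units tagged in advance as about to be moved, added or removed.
    Returns a list of (kind, outlet, unit, expected) tuples.
    """
    changes = []
    # First pass: associations present now that were not present previously.
    for outlet, unit in sorted(current - previous):
        changes.append(("added", outlet, unit, unit in expected_units))
    # Second pass: associations present previously that have now disappeared.
    for outlet, unit in sorted(previous - current):
        changes.append(("removed", outlet, unit, unit in expected_units))
    return changes

# Hypothetical lists: srv-b has disappeared and srv-c has appeared.
previous = {("pdu1/3", "srv-a"), ("pdu2/3", "srv-a"), ("pdu1/4", "srv-b")}
current = {("pdu1/3", "srv-a"), ("pdu2/3", "srv-a"), ("pdu1/5", "srv-c")}
changes = compare_associations(previous, current, expected_units={"srv-c"})
unexpected = [change for change in changes if not change[3]]
```

Here only the removal of `srv-b` is marked as unexpected, because `srv-c` was tagged in advance.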
Such analysis of comparing current and previous association lists may be performed irrespective of how the server-PDU associations are determined in step 203, but may particularly be used in the example in which the associations are determined based on minimising distance metrics between the time series. In some examples, a timestamped log of previous association lists may be stored for further analysis, e.g. to track how changes in the topology may impact the overall efficiency of the system.
At step 205, the method may optionally involve outputting, via the controller 14, one or more remedial actions based on the analysis of the determined associations performed in step 204. In one example, an action may be output if one or more of the constraints are deemed to not be satisfied. For instance, an audio and/or visual alarm (or other suitable alarm) may be generated, e.g. in the vicinity of the server room. Alternatively, or in addition, notifications may be sent to maintenance and/or system administrative personnel, for instance via email, phone notifications, sound or visual indicators in a control room for the data centre 10.
In another example, an action may be output if one or more unexpected changes are detected (at step 204) in the wiring configuration. For instance, an alert may be sent to a system administrator providing information related to the unexpected change. A possible action may be to trigger a secure erase operation of a server 122 associated with the unexpected change, if it is still accessible via the network. A further action could be to prevent access to the relevant servers, e.g. by automatically locking a door of the server room of the data centre 10 in which the servers are located, thereby preventing equipment being removed from the server room. Other actions may also be performed, based on a required security level of the specific data centre under consideration. In relatively low-security cases, actions following an alert being sent to a system administrator (or other relevant personnel) may be performed only after the system administrator confirms that the alert is not a false positive, for instance. While this may increase a delay to applying security measures, it acts to avoid disruptive server downtime in case of false positives. It will be understood that these actions in response to unexpected changes may particularly be useful in the context of detecting, and acting to contain, security breaches or vandalism in a data centre, such as a malicious individual disconnecting a server from a power line, e.g. unauthorised replacement of servers or theft of servers, in a manner that changes the power topology of the system.
As mentioned above, in a specific example the determined associations between the outlets of the PDUs 121 and the servers 122 may be used to ensure that sufficient and necessary redundancy is in place for the servers 122 of the system 12, e.g. for disaster avoidance. The described method may also be used to restore redundancy to each of the servers 122, as required.
In more detail, redundancy refers to the design of a system to duplicate certain components such that failure of one of the components (e.g. such that there is disruption to normal power supply) does not impact on the operation and services of critical IT infrastructure. In the present context, a redundant power supply may be provided so that in the case of a power outage or failure, servers may continue to operate. As mentioned above, servers and/or PDUs in a data centre may be added to, moved or removed from a power distribution system relatively frequently, for instance to perform routine work on computer equipment, such as installations, relocations or upgrades. It can therefore be challenging to ensure that redundancy, e.g. power supply redundancy, is maintained in such an electrical power system.
A redundant power supply requirement or constraint may be that critical equipment must be connected to at least two different PDU outlets, and/or that the PDU outlets to which the critical equipment component is connected receive power from different power sources 124. In the present case, each of the electrical equipment units 122 is a server machine. It may be that the operation of each of the servers 122 is critical such that each of the servers 122 needs a redundant power supply, i.e. each of the servers 122 needs to be connected to at least two of the outlets of the PDUs 121. In different cases, it may be that only some of the servers 122 provide services that are considered to be critical, in which case only that critical subset of servers 122 may be required to have a redundant power supply. In further different cases, the plurality of electrical equipment units may include a number of different types of equipment (e.g. peripheral devices as well as servers), in which case only a subset of the electrical equipment units may be regarded as being critical and need a redundant power supply.
When analysing the server-PDU associations at step 204 of the method 20, it may be determined whether any constraints relating to the necessary redundancy of the system 12 are satisfied. This may first involve identifying which of the electrical equipment units are regarded as critical in the sense that they need a redundant power supply. This may be performed via a look up of an equipment inventory repository, for instance. It may be that certain types of electrical equipment units, e.g. servers, are regarded as critical, whereas other types, e.g. peripheral devices, are not.
For each of the identified critical electrical equipment units, it may first be determined whether the respective unit is connected to at least two different PDU outlets. This ensures that failure of one of the connected PDU outlets does not mean operation of the critical unit is compromised. If the condition that the critical equipment unit is connected to two PDU outlets is satisfied, then it may be determined whether the respective critical equipment unit is linked to at least two different power sources 124a, 124b. That is, the different PDU outlets to which the critical electrical equipment unit is connected may be required to be provided with power from different power sources. For instance, one of the connected PDUs 121 may receive power from the first power source 124a, and the other of the connected PDUs 121 may receive power from the second power source 124b. This ensures that failure of one of the power sources 124a, 124b does not mean operation of the critical equipment unit is compromised.
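By way of illustration only, the two-stage redundancy check described above might be sketched as follows. The `Outlet` record, the `check_redundancy` function and the identifiers used in the example data are hypothetical names introduced for this sketch, and it is assumed that the power source feeding each PDU is already known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outlet:
    pdu_id: str      # PDU the outlet belongs to (hypothetical identifier)
    outlet_id: str   # outlet identifier within that PDU
    source_id: str   # power source feeding the PDU, e.g. "124a"


def check_redundancy(associations, critical_units):
    """Return {unit: reason} for each critical unit failing the constraint."""
    failures = {}
    for unit in critical_units:
        outlets = associations.get(unit, set())
        # First condition: at least two different PDU outlets.
        if len(outlets) < 2:
            failures[unit] = "fewer than two PDU outlets"
        # Second condition: those outlets span at least two power sources.
        elif len({o.source_id for o in outlets}) < 2:
            failures[unit] = "all outlets share one power source"
    return failures


# Assumed example of determined associations (unit -> connected outlets).
associations = {
    "server-1": {Outlet("pdu-A", "1", "124a"), Outlet("pdu-B", "1", "124b")},
    "server-2": {Outlet("pdu-A", "2", "124a"), Outlet("pdu-C", "1", "124a")},
    "server-3": {Outlet("pdu-B", "2", "124b")},
}
failures = check_redundancy(associations, ["server-1", "server-2", "server-3"])
```

In this assumed data, server-1 satisfies both conditions, server-2 has two outlets fed from the same source, and server-3 has only one outlet.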
If it is determined that one of the critical electrical equipment units 122 does not satisfy the redundancy constraint(s), then action may be taken to restore the required redundancy to the system 12. This may involve the controller 14 identifying a PDU outlet to which the critical unit 122 can be connected to restore redundancy. A list of available PDU outlets may be obtained in the first instance, i.e. a list of PDU outlets that are not in use (an outlet being in use by virtue of already being connected to an electrical equipment unit 122, for instance). Such a list may be obtained from the wiring configuration of determined associations (from step 203). From the determined associations, it is known which PDU outlets are connected to which electrical equipment units 122 and, as such, which PDUs 121 have available outlets not currently in use, i.e. not currently connected to another component.
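As a minimal sketch of how the list of available PDU outlets might be derived from the determined associations, the following assumes that each outlet is represented by a (PDU identifier, outlet number) pair and that a complete inventory of outlets is available. The function name `available_outlets` and the example identifiers are hypothetical.

```python
def available_outlets(all_outlets, associations):
    """Return the outlets that no equipment unit is associated with."""
    in_use = set()
    for outlets in associations.values():
        in_use.update(outlets)
    # Outlets in the inventory that do not appear in any association.
    return sorted(set(all_outlets) - in_use)


# Assumed inventory of outlets as (PDU identifier, outlet number) pairs.
all_outlets = [("pdu-A", 1), ("pdu-A", 2), ("pdu-B", 1), ("pdu-B", 2)]

# Assumed determined associations (unit -> connected outlets).
associations = {
    "server-1": {("pdu-A", 1), ("pdu-B", 1)},
    "server-2": {("pdu-A", 2)},
}
free = available_outlets(all_outlets, associations)
```

Here the only outlet not associated with any unit is outlet 2 of "pdu-B".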
In one example, the output action at step 205 may be simply to provide an indication of which critical equipment unit 122 does not satisfy the redundancy requirement, along with the list of available PDU outlets, so that a user or operator can select which of the available PDU outlets to connect to the identified critical equipment unit 122 to restore redundancy.
Alternatively, the step of analysing the determined associations against redundancy constraints may further include selecting a particular one (or more) of the available PDU outlets, and then the output action may be to provide a specific recommendation to a user to connect the selected PDU outlet to the identified critical equipment unit 122 to restore redundancy.
The identified critical equipment unit 122 along with the list of available PDU outlets, or the specific recommendation, may be provided in any suitable manner. For instance, this could be performed via alerts sent to management software for the system 12, a text message or call to a mobile telephone, or visual alerts in a control room of the data centre 10.
The selection of a particular one of the available PDU outlets may be based on a number of different factors, and may be performed to optimise one or more aspects of the wiring configuration and system operation. A physical layout or arrangement of the various components in the data centre 10 may be stored in memory, and may be available to the controller 14. The selection of a particular available PDU outlet may be based on the relative physical proximity of different components of the system 12. For instance, in one example the particular one of the available PDU outlets (or an available outlet of the particular one of the PDUs 121) that is closest to the identified critical unit 122 may be selected, which can assist in maintaining a simple wiring configuration. In another example, the particular one of the available PDU outlets that is closest to / adjacent to another (or the other) PDU outlet that is connected to the identified critical unit 122 - but, optionally, which receives power from a different power source 124 - may be selected, again for reasons of configuration simplicity for instance.
The selection of a particular one of the available PDU outlets may optionally be based on a loading, at a given time, of different power sources 124 providing power to the PDUs 121. A current loading of different power sources 124 may be obtained in any suitable manner. For instance, the current loading of each power source may be inferred from the determined server-PDU associations. In an example, the selection of an available PDU outlet may be made to improve the load balancing in the system 12, e.g. the selected PDU outlet may be part of a PDU 121 that receives power from the power source 124 in the system 12 that has the lowest current loading.
The selection of a particular one of the available PDU outlets may be based on a combination of different factors, e.g. according to an optimisation algorithm that optimises across a plurality of different factors. For instance, the selected PDU outlet may be identified based on one or more of: maximising the use of adjacent PDU outlets or adjacent PDUs 121; a proximity to the electrical equipment unit in question; improved load balancing of the system 12; and, a consideration of the entire power chain for the identified PDU or PDU outlet.
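Purely as an illustrative sketch, a selection that combines physical proximity with power source loading might use a weighted cost, as below. The weighted-sum form, the normalisation, the weights `w_distance` and `w_load`, the coordinate representation and the loading values are all assumptions made for this example; the description does not prescribe a particular optimisation algorithm.

```python
import math


def select_outlet(unit_pos, candidates, source_load,
                  w_distance=1.0, w_load=1.0, exclude_sources=frozenset()):
    """Pick the available outlet with the lowest weighted cost.

    candidates: list of dicts with keys "outlet", "pos" and "source".
    exclude_sources: sources already feeding the unit, skipped so that
    the new connection comes from a different power source.
    """
    eligible = [c for c in candidates if c["source"] not in exclude_sources]
    if not eligible:
        return None
    # Normalise each factor to [0, 1]; fall back to 1.0 if the maximum is 0.
    max_dist = max(math.dist(unit_pos, c["pos"]) for c in eligible) or 1.0
    max_load = max(source_load[c["source"]] for c in eligible) or 1.0

    def cost(c):
        return (w_distance * math.dist(unit_pos, c["pos"]) / max_dist
                + w_load * source_load[c["source"]] / max_load)

    return min(eligible, key=cost)["outlet"]


# Assumed layout coordinates (metres) and source loadings (kW).
unit_pos = (0.0, 0.0)
candidates = [
    {"outlet": ("pdu-A", 3), "pos": (1.0, 0.0), "source": "124a"},
    {"outlet": ("pdu-B", 4), "pos": (2.0, 0.0), "source": "124b"},
    {"outlet": ("pdu-C", 1), "pos": (8.0, 0.0), "source": "124b"},
]
source_load = {"124a": 9.0, "124b": 3.0}

balanced = select_outlet(unit_pos, candidates, source_load)
nearest = select_outlet(unit_pos, candidates, source_load, w_load=0.0)
```

With equal weights the sketch favours the nearby outlet on the lightly loaded source; setting `w_load` to zero reduces it to a closest-outlet rule, corresponding to the proximity-only example above.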
Many modifications may be made to the described examples without departing from the scope of the appended claims.

Claims

1. A computer-implemented method for monitoring an electrical power system comprising a plurality of power distribution units (PDUs) and a plurality of electrical equipment units to be provided with electrical power, the method being for determining a configuration of electrical wiring connections between power outlets of the PDUs and the electrical equipment units, the method comprising: receiving PDU data indicative of power usage of each of the PDU outlets over time; receiving activity data indicative of one or more activity metrics for each of the electrical equipment units over time; and, determining, based on the received PDU data and the received activity data, associations between the PDU outlets and the electrical equipment units to determine the electrical wiring connection configuration between the outlets of the PDUs and the electrical equipment units.
2. A method according to Claim 1, wherein determining the associations between the PDU outlets and the electrical equipment units comprises: for each of the electrical equipment units, estimating a model that describes the activity of the respective electrical equipment unit as a function of the power usage of each of the PDU outlets; and, selecting, based on the estimated model, which of the PDU outlets are associated with the respective electrical equipment unit, or for each of the PDU outlets, estimating a model that describes the power usage of the respective PDU outlet as a function of the activity of each of the electrical equipment units; and, selecting, based on the estimated model, which of the electrical equipment units are associated with the respective PDU outlet.
3. A method according to Claim 2, wherein the function is a linear function.
4. A method according to Claim 2 or Claim 3, wherein the model is estimated using a linear regression approach.
5. A method according to any of Claims 2 to 4, wherein, in the model, the power usage of each of the PDU outlets has a respective coefficient associated therewith, each coefficient being indicative of a proportion of the activity of the electrical equipment unit that relates to the respective PDU outlet, and wherein estimating the model comprises estimating a value of each of the coefficients.
6. A method according to Claim 5, wherein, for each of the PDU outlets, if the estimated value of the respective coefficient is greater than a prescribed threshold value, then the respective PDU outlet is determined to be associated with the respective electrical equipment unit.
7. A method according to Claim 5 or Claim 6, wherein, if the estimated value of the respective coefficient is within a prescribed tolerance from zero, then the respective PDU outlet is determined to not be associated with the respective electrical equipment unit.
8. A method according to any of Claims 2 to 7, wherein the selection step comprises, for one of the PDU outlets: estimating a further model that describes the activity of the respective electrical equipment unit as a function of the power usage of each of the PDU outlets other than said one PDU outlet; and, determining that said one of the PDU outlets is associated with the respective electrical equipment unit if the estimated activity of the further model is within a prescribed tolerance of the estimated activity of the model.
9. A method according to Claim 8, wherein, if the estimated activity of the further model is not within a prescribed tolerance, then said one of the PDU outlets is discarded from the model.
10. A method according to Claim 9, wherein the selection step comprises repeating the estimation and determination steps for each of the PDU outlets, wherein the further model describes the activity of the respective electrical equipment unit as a function of the power usage of those PDU outlets not previously discarded from the model.
11. A method according to any of Claims 2 to 10, wherein the steps of estimating the model and selecting which of the PDU outlets are associated with the respective electrical equipment unit are performed simultaneously.
12. A method according to Claim 11, wherein the estimating and selecting steps are performed using an elastic net algorithm.
13. A method according to any previous claim, wherein determining the associations between the PDUs and the electrical equipment units comprises: calculating a distance metric between the power usage of each PDU outlet and the one or more activity metrics of each electrical equipment unit; and, determining, based on the calculated distance metrics, which of the PDU outlets are associated with each of the respective electrical equipment units.
14. A method according to Claim 13, wherein determining which of the PDU outlets are associated with each of the respective electrical equipment units comprises solving a linear sum assignment problem to minimise a sum of the distance metrics between associated PDU outlets and electrical equipment units.
15. A method according to Claim 14, wherein each electrical equipment unit is associated with more than one of the PDUs; optionally wherein each electrical equipment unit is associated with two PDUs.
16. A method according to any previous claim, the method comprising analysing the determined electrical wiring configuration against one or more defined constraints to be satisfied by the electrical wiring configuration, and outputting an action if the determined electrical wiring configuration does not satisfy each of the constraints.
17. A method according to Claim 16, wherein the one or more defined constraints includes a redundancy constraint comprising, for each of the plurality of electrical equipment units, that: the respective electrical equipment unit must be connected to at least two of the PDUs; and, at least two of the PDUs to which the respective electrical equipment unit is connected must receive power from different power sources of the electrical power system.
18. A method according to Claim 17, the method comprising identifying one or more of the electrical equipment units that provide critical functionality, wherein only the identified one or more electrical equipment units are analysed against the redundancy constraint.
19. A method according to Claim 17 or Claim 18, wherein, for each of the plurality of electrical equipment units, if the respective electrical equipment unit does not satisfy the redundancy constraint then the method comprises selecting a PDU outlet from one or more available PDU outlets of the plurality of PDU outlets, and wherein outputting the action comprises providing a recommendation to connect the respective electrical equipment unit to the selected PDU outlet.
20. A method according to Claim 19, wherein selection of the PDU outlet from the plurality of PDU outlets is based on one or more of: a physical proximity of the respective electrical equipment unit to the one or more available PDU outlets; optionally, wherein the selected PDU outlet is the PDU outlet in closest physical proximity to the respective electrical equipment unit; a respective current loading of each of a plurality of power sources providing power to the plurality of PDUs; optionally, wherein the selected PDU outlet is a PDU outlet that receives power from the power source having the lowest current loading.
21. A method according to any of Claims 16 to 20, wherein the one or more defined constraints includes one or more of: each of the PDUs has no more than a defined maximum number of electrical equipment units connected thereto; and, a particular electrical equipment unit must be connected to a PDU in a defined subset of the PDUs.
22. A method according to any of Claims 16 to 21, wherein outputting the action includes one or more of: generating an alarm, optionally an audio and/or visual alarm; and, outputting a notification to one or more personnel, optionally maintenance and/or system administrative personnel.
23. A method according to any previous claim, the method comprising: comparing the determined associations at a current time step against previously determined associations from a previous time step to identify changes in the wiring configuration; and, for each identified change, determining whether the change is an unexpected change, and outputting an action if the change is an unexpected change.
24. A method according to Claim 23, wherein the action comprises one or more of: generating an alert for one or more service or maintenance personnel; triggering a secure erase of the electrical equipment unit relevant to the unexpected change; and, causing physical access to the electrical equipment unit relevant to the unexpected change to be restricted.
25. A method according to any previous claim, wherein the received PDU data is time series data indicative of power usage of the PDU outlets over a defined historical time period, and wherein the received activity data is time series data indicative of activity metrics of the electrical equipment units over the defined historical time period.
26. A method according to Claim 25, the method comprising, prior to determining the associations between the PDU outlets and the electrical equipment units, realigning the time series data so that samples of the PDU data and activity data relate to the same time steps.
27. A method according to Claim 26, wherein the realigning step comprises: up-sampling the PDU data and/or the activity data, and interpolating the up-sampled data to a defined sampling period; and, down-sampling the interpolated data to a desired sampling period with samples of the PDU data and activity data relating to the same time steps.
28. A method according to any previous claim, wherein the electrical power system is a data centre electrical power system, and wherein one or more of the electrical equipment units are server machines.
29. A method according to any previous claim, wherein the one or more activity metrics of each electrical equipment unit include one or more of: central processing unit (CPU) utilisation of the electrical equipment unit; memory utilisation of the electrical equipment unit; a number of bytes transferred in input/output operations generated by a process of the electrical equipment unit; disk accesses per second; and, graphics processing unit (GPU) activity of the electrical equipment unit.
30. A non-transitory, computer-readable storage medium storing instructions thereon that when executed by a processor cause the processor to perform a method according to any previous claim.
31. A controller for monitoring an electrical power system comprising a plurality of power distribution units (PDUs) and a plurality of electrical equipment units to be provided with electrical power, the controller being for determining a configuration of electrical wiring connections between power outlets of the PDUs and the electrical equipment units, the controller comprising one or more processors configured to: receive PDU data indicative of power usage of each of the PDU outlets over time; receive activity data indicative of one or more activity metrics for each of the electrical equipment units over time; and, determine, based on the received PDU data and the received activity data, associations between the PDU outlets and the electrical equipment units to determine the electrical wiring connection configuration between the outlets of the PDUs and the electrical equipment units.
PCT/EP2022/025170 2022-02-18 2022-04-21 Determining an electrical wiring connection configuration of an electrical power system WO2023155968A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN202211008705 2022-02-18

Publications (1)

Publication Number Publication Date
WO2023155968A1 (en) 2023-08-24

Family

ID=81750664

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2022/025170 WO2023155968A1 (en) 2022-02-18 2022-04-21 Determining an electrical wiring connection configuration of an electrical power system

Country Status (1)

Country Link
WO (1) WO2023155968A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100005331A1 (en) * 2008-07-07 2010-01-07 Siva Somasundaram Automatic discovery of physical connectivity between power outlets and it equipment
EP3637261A1 (en) * 2018-10-10 2020-04-15 Schneider Electric IT Corporation Systems and methods for automatically generating a data center network mapping for automated alarm consolidation

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
CROUSE: "On Implementing 2D Rectangular Assignment Algorithms", IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, vol. 52, no. 4, 2016, pages 1679 - 1696, XP011633801, DOI: 10.1109/TAES.2016.140952
GHARGHABI ET AL.: "More Challenging Scenarios", IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2018, pages 965 - 970, XP033485621, DOI: 10.1109/ICDM.2018.00119
HOERL ET AL.: "Ridge Regression: Biased Estimation for Nonorthogonal Problems", TECHNOMETRICS, vol. 12, no. 1, 1970, pages 55 - 67
TIBSHIRANI: "Regression shrinkage and selection via the lasso", J. R. STATIST. SOC. B, vol. 58, no. 1, 1996, pages 267 - 288
WANG ET AL.: "Image Quality Assessment: From Error Visibility to Structural Similarity", IEEE TRANSACTIONS ON IMAGE PROCESSING, vol. 13, no. 4, 2004, pages 600 - 612, XP011110418, DOI: 10.1109/TIP.2003.819861
ZOU ET AL.: "Regularization and Variable Selection via the Elastic Net", J. R. STATIST. SOC. B, vol. 67, 2005, pages 301 - 320, XP055849418, DOI: 10.1111/j.1467-9868.2005.00503.x

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22724642

Country of ref document: EP

Kind code of ref document: A1