CA2524550A1 - A system for optimizing server use in a data center - Google Patents

A system for optimizing server use in a data center

Info

Publication number
CA2524550A1
Authority
CA
Canada
Prior art keywords
machine
machines
data
ofx
engine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA 2524550
Other languages
French (fr)
Inventor
Daniel Sieroka
Cadman Chui
Peter Fagerstroem
Stephen Pollack
John Stetic
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NetIQ Corp
Original Assignee
PlateSpin Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CA002486103A (CA2486103A1)
Application filed by PlateSpin Ltd
Priority to CA 2524550 (CA2524550A1)
Publication of CA2524550A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1097Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Abstract

A system and method for analyzing the usage of servers and their associated applications and providing suggestions on how to best make use of the resources of the servers.
Numerous forms of analysis are utilized to provide suggestions to a user for efficient use of the servers.
Should a suggestion be accepted by a user a conversion between servers is conducted.
Optionally a conversion may be conducted automatically based upon a suggestion.

Description

Attorney Docket No.: SPIN002-04CA

A System for Optimizing Server Use in a Data Center

COPYRIGHT NOTICE
[0001] A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
BACKGROUND OF THE INVENTION
[0002] Computing systems are growing rapidly in size and complexity. Many businesses have data centers consisting of a multitude of servers. In such an environment servers will have different configurations of hardware and software, including operating systems.
[0003] One of the problems in managing a data center is moving Operating Systems, applications and data between servers and load balancing to provide optimal use of the servers.
[0004] Moving an Operating System, related applications and data from a source server to a target server traditionally requires that all software on the source server be reinstalled on the target server. This is often not trivial. The source server may have legacy applications that cannot be reinstalled. Further the source server may be utilizing a version of an operating system that is not supported on the target server. The source server and target server may also differ in device drivers and connections to peripherals. Typically the individual performing the transfer must have direct contact with the source and target machines to insert media and enter commands.
[0005] Load balancing requires a user to determine which software applications should run on which servers and when. In a large data center this is a complex problem. Load balancing is a constantly moving target as both applications and server configurations change.
The user must be aware of all applications, the amount of resources they require and when the applications use the resources of a server.

SUMMARY OF THE INVENTION
[0006] Some embodiments of the present invention are directed to a system for remotely monitoring usage of machines in a data center and suggesting conversions between machines to make efficient use of the resources in the data center, the system comprising a data collection engine and an optimization engine operatively coupled to the data collection engine.
[0007] Some embodiments of the present invention are directed to a method for remotely monitoring usage of machines in a data center to make efficient use of resources in the data center, the method comprising the steps of collecting performance and machine data, analyzing the data, and suggesting conversions between machines.

BRIEF DESCRIPTION OF THE DRAWINGS
[0008] For a better understanding of the present invention, and to show more clearly how it may be carried into effect, reference will now be made, by way of example, to the accompanying drawings which aid in understanding an embodiment of the present invention and in which:
[0009] Figure 1 is a block diagram illustrating the conversions between machines;
[0010] Figure 2 is a block diagram of a system utilizing an embodiment of the present invention;
[0011] Figure 3 is a block diagram of a data center;
[0012] Figure 4 is a block diagram illustrating the interactions between the PowerConvert and OFX modules;
[0013] Figure 5 is a block diagram illustrating machine hierarchy;
[0014] Figures 6a and 6b are a flow chart of the functionality of PowerConvert;
[0015] Figure 7 is a block diagram of the components of an OFX controller;
[0016] Figure 8 is a block diagram of the components of PowerRecon;
[0017] Figure 9 is a flow chart of the functionality of the analysis portion of PowerRecon; and
[0018] Figure 10 is a block diagram of the components of PowerOptimize.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
[0019] Referring first to Figure 1, a block diagram illustrating the conversions between machines is shown. There are three types of machines: physical 10, virtual 12 and image 14.
Physical machines 10 are servers upon which an operating system and its related software applications run. Virtual machines 12 emulate a specific environment and run on virtual machine server software. For example, some virtual machines 12 may run on a version of Linux, others on versions of Windows. Through the use of virtual server software such as ESX and GSX provided by VMware Inc., and Microsoft Virtual Server (hereinafter referred to as MSVS), multiple virtual machines 12 may be deployed on a physical machine 10. Other virtual servers such as Xen (which is open source), Virtual Iron and SW-Soft may also be supported by the embodiments of the present invention. Machine images 14 are stored copies of the state of a physical machine 10 or a virtual machine 12 at a specific time.
[0020] The conversion from a physical machine 10 (P) to a virtual machine 12 (V) is referred to as P2V. Similarly, the conversion from a virtual machine 12 (V) to a machine image 14 (I) is referred to as V2I. In general, X is used as a wildcard whenever a statement applies regardless of the type of the source or target. For example, I2X represents a conversion from a machine image to any other type: physical, virtual or image. In total there are nine possible conversion types, as shown in Figure 1. The intent of Figure 1 is to illustrate that a machine of any type may be converted to any other type utilizing the present invention.
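The naming scheme can be illustrated with a short sketch. The Python below is only an illustration and is not part of the disclosed system; the names MACHINE_TYPES, conversion_name and expand are assumptions introduced here.

```python
from itertools import product
from typing import List

# Machine type codes as used in the text: Physical, Virtual, Image.
MACHINE_TYPES = ("P", "V", "I")


def conversion_name(source: str, target: str) -> str:
    """Build a conversion label such as 'P2V' from two type codes."""
    return f"{source}2{target}"


def expand(pattern: str) -> List[str]:
    """Expand a label that may use the wildcard X, for example 'I2X'."""
    source, target = pattern.split("2")
    sources = MACHINE_TYPES if source == "X" else (source,)
    targets = MACHINE_TYPES if target == "X" else (target,)
    return [conversion_name(s, t) for s, t in product(sources, targets)]


# Every combination of source and target type.
ALL_CONVERSIONS = expand("X2X")
```

Expanding "X2X" yields the nine conversion types, and expanding "I2X" yields the three conversions whose source is a machine image.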
[0021] Referring next to Figure 2, a block diagram of a system utilizing an embodiment of the present invention is shown generally as 20. Data center 22 is where the machines reside and is shown in greater detail in Figure 3. In managing data center 22 a user may wish to move operating systems, applications and data between machines 10, 12, and 14 depending on the addition of new servers, load balancing and disaster recovery. For example, it may be determined that a virtual machine 12 would run more efficiently on a different physical machine 10. Machine images 14 allow for the state of a physical machine 10 or a virtual machine 12 to be backed up and restored as needed. A machine image 14 is stored on an image server, which serves as a host for the image. The only difference between a machine image 14 and a virtual machine 12 is that the machine image 14 cannot be started. To start a machine image 14, it must be moved to either a physical machine 10 or a virtual machine 12.

[0022] The conversion between machines is directed by PowerConvert 24. PowerConvert 24 resides on a server and has a distinct URL. Through the use of a Graphical User Interface (GUI) 26, a user may manage the movement of operating systems, applications and data between machines 10, 12 and 14 residing in a network of machines shown as data center 22. PowerConvert 24 obtains information on machines within data center 22 as selected by the user through GUI 26 and allows the user to move operating systems, applications and data between machines. OFX 28 controls and reports on the jobs requested by PowerConvert 24 and client applications 30. OFX 28 resides on a server and has a distinct URL. In essence, OFX 28 is a generic job management engine that remotely executes and monitors jobs through OFX controllers 44 (see Figure 3). Applications can be created through the use of OFX functionality.
[0023] PowerRecon 32 accesses servers in data center 22 to monitor the servers and collects statistical data on real and virtual machine usage. PowerRecon 32 may be thought of as a data collection engine. PowerOptimize 34 utilizes the data gathered by PowerRecon 32 to suggest to a user options for optimizing the use of the servers in data center 22.
PowerOptimize 34 may be thought of as an optimization engine. Should the user wish to have PowerOptimize 34 automatically act upon the options selected, PowerOptimize 34 instructs PowerConvert 24 to perform the optimizations.
[0024] Referring now to Figure 3, a block diagram of a data center 22 is shown. Data center 22 is a repository of various types of machines in various numbers. In the example shown there is a physical machine 10, a virtual server 40 hosting a plurality of virtual machines 12 and a machine image server 42 hosting a plurality of machine images 14. A virtual server 40 is a computer running virtual server software such as MSVS, ESX, or GSX.
Through the use of virtual server software, multiple virtual machines may exist. Machine image server 42 is a computer that controls the storage of a number of machine images 14.
Generically, a machine container is either a virtual server 40 or a machine image server 42.
Also, the term contained machine is used when making reference to either a virtual machine 12 or a machine image 14.
[0025] Figure 4 is a block diagram illustrating the interactions between the PowerConvert and OFX modules.

[0026] PowerConvert 24 comprises four main components: PowerConvert Business Server 42, Database 54, PowerConvert Web Services Interface 56, and PowerConvert Controller 58. PowerConvert Business Server 42 handles requests to convert from a source machine to a target machine. It stores archived operations and device driver information in database 54, which contains information on the set of device drivers necessary when converting a machine. Users and client applications 30 communicate with the PowerConvert Business Server 42 through PowerConvert Web Services Interface 56. In one embodiment PowerConvert Web Services Interface 56 utilizes Simple Object Access Protocol (SOAP) over Hypertext Transfer Protocol (HTTP) to provide a standard interface.
PowerConvert Controller 58 is an instance of an OFX controller 44 (see Figure 3), but its role is specialized.
It is responsible for running discovery jobs and jobs that guide the overall conversion process, which includes deploying other controllers to remote machines when necessary.
Primarily, it is responsible for handling any requests to PowerConvert 24 that cannot be fulfilled synchronously in the time of a typical HTTP request/response.
[0027] OFX 28 controls and reports on the jobs requested by PowerConvert 24 and client applications 30. OFX 28 resides on a server and has a distinct URL. In essence, OFX 28 is a generic job management engine that remotely executes and monitors jobs through OFX controllers 44 (see Figure 3). Applications can be created through the use of OFX functionality. The core of OFX 28 is OFX Business Server 60, which runs jobs through OFX Controllers 44. OFX Business Server 60 is passive; it is a web server and responds to communication from OFX Web Services Interface 46. In one embodiment OFX Web Services Interface 46 utilizes Simple Object Access Protocol (SOAP) over Hypertext Transfer Protocol (HTTP) to provide a standard interface. OFX Business Server 60 stores all information on requests, the status of requests and machine configuration information in a database 62. In operation, OFX Business Server 60 receives information on the status of requests from OFX Web Services Interface 46 through OFX controllers 44 (see Figure 3) installed on machines in data center 22.
[0028] PowerConvert 24 is a fully automated solution for OS portability. That is, PowerConvert 24 can move the entire contents of a machine, including its operating system, applications and data, to another machine. PowerConvert 24 will convert a source machine to a target machine. As discussed earlier, the types of source and target machines are Physical (P), Virtual (V), and Image (I). The steps required for each of the nine possible conversion types are illustrated in Table 1 below. Note that the first four rows refer to discovery steps.
Discovery steps are prerequisites to the conversion taking place. If the desired source and target machines cannot be discovered, the conversion will not take place.
[0029] Depending on the source machine and the target machine types used in a conversion, the actual steps used in the conversion process differ. Typically, either a step can be omitted because it is not needed, or a different step needs to be inserted because of the special processing involved for that conversion type.
[0030] There are some prerequisites before the conversion can begin. First, the appropriate source and target machines must be discovered. Next, the user must initiate and configure the parameters that define the conversion process. By default, the target machine will be configured with essentially the same properties as the source machine.
This includes the hostname, amount of RAM, network configuration, number and sizes of disks, and other information. Using GUI 26, the user then modifies the configuration of the target machine to suit their needs. This may include changing the hostname or changing the memory size of the target machine.
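The default-then-override behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions: the MachineConfig type, its three fields and the helper names are hypothetical and cover only a few of the properties the text lists.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class MachineConfig:
    """Hypothetical subset of the properties carried into a conversion."""
    hostname: str
    ram_mb: int
    disk_sizes_gb: Tuple[int, ...]


def default_target_config(source: MachineConfig) -> MachineConfig:
    """By default the target mirrors the source configuration."""
    return replace(source)


def customize(config: MachineConfig, **overrides) -> MachineConfig:
    """Apply the user's edits on top of the default configuration."""
    return replace(config, **overrides)


source = MachineConfig(hostname="web01", ram_mb=512, disk_sizes_gb=(20, 120))
target = customize(default_target_config(source), hostname="web01-vm", ram_mb=1024)
```

Because the configuration object is immutable, the source description is left untouched while the target picks up only the properties the user chose to change.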
[0031] The conversion process is defined in a set of OFX jobs and actions that run on various OFX controllers 44 installed on machines throughout the data center.
[0032] The conversion process is guided by a job running on PowerConvert Controller 58. Each action (or step) in the job is run in sequence. PowerConvert Controller 58 cannot be expected to perform the entire conversion process, since the conversion is almost always distributed among several machines in the data center. Whenever the 'next' step in the conversion process needs to be run on a remote machine (for example, an ESX server on which a virtual machine will be created), it is the responsibility of the job running on PowerConvert Controller 58 to schedule the appropriate job to run on the appropriate OFX controller 44 (see Figure 3). It does this by calling OFX 28.
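The guiding job can be pictured as a loop that hands each step to the controller on the machine where that step must run. The sketch below is an assumption-laden illustration, not the OFX interface: the Controller, Step and JobScheduler classes and the host names are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Controller:
    """Stand-in for a controller installed on one machine."""
    host: str
    log: List[str] = field(default_factory=list)

    def run(self, action: str) -> None:
        self.log.append(action)


@dataclass
class Step:
    action: str
    host: str  # machine whose controller must run this action


class JobScheduler:
    """Stand-in for the job management engine that routes jobs."""

    def __init__(self) -> None:
        self.controllers: Dict[str, Controller] = {}

    def register(self, controller: Controller) -> None:
        self.controllers[controller.host] = controller

    def schedule(self, step: Step) -> None:
        self.controllers[step.host].run(step.action)


def run_conversion(scheduler: JobScheduler, steps: List[Step]) -> List[str]:
    """Run each step in sequence on whichever controller owns it."""
    history = []
    for step in steps:
        scheduler.schedule(step)
        history.append(f"{step.host}:{step.action}")
    return history


scheduler = JobScheduler()
for host in ("powerconvert", "esx01"):
    scheduler.register(Controller(host))

history = run_conversion(
    scheduler,
    [
        Step("discover source", "powerconvert"),
        Step("create VM", "esx01"),
        Step("copy volumes", "esx01"),
    ],
)
```

The guiding job never performs the remote work itself; it only decides which controller should run the next step.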
[0033] Table 1 below indicates which steps need to be executed for the given conversion type.

Table 1 - Discovery and PowerConvert Steps

Description                                                     P2V V2V I2V P2I V2I I2I P2P V2P I2P
Discover Source                                                  x   x   -   x   x   -   x   x   -
Discover Source Machine Container                                -   -   x   -   -   x   -   -   x
Discover Target Machine Container                                x   x   x   x   x   x   -   -   -
Discover Physical Target                                         -   -   -   -   -   -   x   x   x
Install Controller on Source Machine Container (if necessary)   -   -   x   -   -   x   -   -   x
Install Controller on Target Machine Container (if necessary)   x   x   x   x   x   x   -   -   -
Create VM                                                        x   x   x   -   -   -   -   -   -
Take Control of Target Machine                                   x   x   x   -   -   -   -   -   -
Create Volumes on Target                                         x   x   x   -   -   -   x   x   x
Take Control of Source Machine                                   x   x   -   x   x   -   x   x   -
Copy Volumes from Source to Target                               x   x   x   x   x   x   x   x   x
Prepare OS to Boot                                               x   x   x   -   -   -   x   x   x
Restart Target                                                   x   x   x   -   -   -   x   x   x
Configure OS in Target                                           x   x   x   -   -   -   x   x   x
Restart Source Machine (optional)                                x   x   -   x   x   -   x   x   -

[0034] When PowerConvert 24 has been instructed to perform a conversion from a source machine to a target machine, it needs to provide instructions to and receive status information from those machines. This is done through OFX 28 via OFX Web Services Interface 46.
[0035] A user through the use of GUI 26, or an application through clients 30, opens a discover machine dialog and provides the machine identification, such as a hostname or IP address, and their credentials. This results in a job being scheduled on PowerConvert controller 58 to discover the information about the source machine. Once complete, the information collected is forwarded to OFX 28 and stored in database 62. The discovery gathers all the information needed for a conversion, as well as some other information that may be useful to the user. The information includes all of the machine's components: processors, disks, network adapters, the amount of memory on the machine, details about the operating system, and the network connections.
[0036] If the source machine is running Windows, PowerConvert 24 makes use of Windows Management Instrumentation (WMI) to remotely query the source machine. Since not all of the information that PowerConvert 24 needs is available through WMI, other means of gathering information are also utilized. For example, the Media Access Control (MAC) physical address of each Network Interface Card (NIC) and the properties of each disk volume are queried by deploying a small executable program to the source machine, running it, and copying back the data it generates. In the case of a Linux source machine, PowerConvert 24 communicates with the source machine using a secure protocol such as Secure Shell (SSH). PowerConvert 24 copies a small executable program to the source machine, runs it, and copies back the data it generates. In one embodiment the data is provided as Extensible Markup Language (XML).
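As an illustration of consuming discovery data returned as XML, the following sketch parses a small payload with the Python standard library. The element and attribute names (machine, nic, volume, mac, size_gb, used_gb) are assumptions made for this example; the actual schema is not given in this passage.

```python
import xml.etree.ElementTree as ET

# Hypothetical discovery payload; the real schema is not shown in the text.
SAMPLE_XML = """
<machine hostname="db01">
  <nic mac="00:11:22:33:44:55"/>
  <nic mac="00:11:22:33:44:66"/>
  <volume mount="C:" size_gb="20" used_gb="18"/>
  <volume mount="D:" size_gb="120" used_gb="10"/>
</machine>
"""


def parse_discovery(xml_text: str) -> dict:
    """Extract NIC addresses and volume properties from discovery XML."""
    root = ET.fromstring(xml_text.strip())
    return {
        "hostname": root.get("hostname"),
        "macs": [nic.get("mac") for nic in root.findall("nic")],
        "volumes": {
            vol.get("mount"): (int(vol.get("size_gb")), int(vol.get("used_gb")))
            for vol in root.findall("volume")
        },
    }


info = parse_discovery(SAMPLE_XML)
```

A structured result of this kind is what a later configuration step would need in order to propose default disk sizes and network settings for the target.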
[0037] Figure 5 is a block diagram illustrating machine object hierarchy, shown generally as 60. Each machine type is defined in an XML schema. To aid the reader in understanding this mapping of machines, Figure 3 describes the physical presence of the machines. Figure 5 describes the hierarchical structure of how machines are described using XML.
Machine 62 is the base type from which all other machine types are derived. Each derived type defines additional properties that are not present in its base type.
[0038] By way of example we illustrate three types of virtual machines 70. They are Microsoft Virtual Machine 72, VMware ESX Virtual Machine 74 and VMware GSX Virtual Machine 76. An example of the XML describing a VMware ESX Virtual Machine 74 is provided in Appendix 1.
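The idea that each derived type adds properties to its base type can be mirrored with class inheritance. The sketch below is an analogy only: the text defines the hierarchy in an XML schema, and the property names used here (hostname, memory_mb, display_name, cpu_shares) are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Machine:
    """Base type: properties shared by every machine."""
    hostname: str
    memory_mb: int


@dataclass
class VirtualMachine(Machine):
    """Adds a property that only applies to virtual machines."""
    display_name: str


@dataclass
class VMwareESXVirtualMachine(VirtualMachine):
    """Adds a hypothetical ESX-specific property."""
    cpu_shares: int


@dataclass
class VMwareGSXVirtualMachine(VirtualMachine):
    """No extra properties in this sketch."""


@dataclass
class MicrosoftVirtualMachine(VirtualMachine):
    """No extra properties in this sketch."""


vm = VMwareESXVirtualMachine("app01", 512, "App 01", 1000)
property_names = [f.name for f in fields(vm)]
```

The derived type carries every property of its ancestors plus its own, which is the relationship the machine hierarchy describes.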
[0039] In the case of conversion of a machine image 14 as a source machine, a discovery of its machine container occurs. Discovery of a machine container is a matter of determining whether the source machine has a certain property. That is, is it ESX, GSX, MSVS or an Image Server? After determining that the source machine is a machine container, queries are made to determine the properties of each machine container on the source machine. These properties include the version of the application (e.g. ESX v2.5), any special devices that are configured on the machine (e.g. the list of virtual NICs) as well as a list of all of the contained machines. If the source machine is a machine image 14, then the server on which it resides must be discovered: because a machine image 14 cannot be started, it cannot be discovered directly.
[0040] In the case of a conversion to a target virtual machine 12 or a machine image 14, a discovery of a machine container occurs. This discovery is the same as the previous discovery step mentioned for conversion of a source machine that is an image.
[0041] If the target machine is a physical machine 10, a discovery is made of the physical machine 10. In one embodiment this may require manual effort by the user, who must boot the machine using a PowerConvert boot CD, since it is expected that the physical machine 10 may be bare. The boot CD contains a copy of Windows Preinstallation Environment (WinPE). The boot CD also contains a Windows application to assist the user to discover and register the machine with PowerConvert 24. The application prompts the user for the URL of PowerConvert 24 and the credentials with which to access it. This results in PowerConvert 24 instructing OFX 28 to create an OFX controller entry in database 62 for that machine. Next, an OFX controller 44 is downloaded from OFX 28 into the WinPE environment and installed and configured. A discovery job is then scheduled to run on this controller.
The discovery job collects information about the physical machine such as: memory size, number of processors, speed of processors, number and sizes of disks including partitions and volumes and all available components, including network adapters and hard disk controllers.
This machine information is stored by OFX 28 in database 62. In the case of Linux, a Linux ramdisk is used instead of the boot CD. All other steps remain the same.
[0042] Referring now to Figures 6a and 6b a flow chart of the functionality of PowerConvert 24 is shown generally as 80. Flow chart 80 refers to the steps conducted by PowerConvert 24 should the above mentioned discovery be successful. If discovery is not successful, information on the source and target machine will not be provided to the user via GUI 26 to allow them to initiate conversion.
[0043] Beginning at step 82, an OFX controller 44 is installed on the source machine container, if necessary. When the source machine for a conversion is a machine image 14, PowerConvert 24 manages machine image 14 by running jobs on its host machine image server 42. Thus, an OFX controller 44 must be deployed to the host machine image server 42. If the host machine image server 42 already has an OFX controller 44 installed on it from an earlier conversion, then this step can be skipped.
[0044] Moving now to step 84, if the target machine is a virtual machine 12 or a machine image 14, then PowerConvert 24 manages these machines by running jobs on the host Virtual Machine Server 40 (e.g., ESX, GSX or MSVS) or the Machine Image Server 42.
Thus, an OFX controller 44 must be deployed to the host machine container. If the host machine container already has an OFX controller 44 installed on it from an earlier conversion, then this step can be skipped.
[0045] It is not uncommon for a host machine with an abundance of memory, disk, and CPU resources to be used for multiple purposes. For example, a host machine may be running both VMware GSX and MSVS. In this case, both machine container applications will share the same OFX controller 44 to run their jobs. There is no need to install a new instance of an OFX controller 44 for each machine container on the same host.
[0046] Similarly, PowerConvert 24 might be installed on a host that is also running MSVS and an Image Server. In this case, PowerConvert controller 58, which is actually an instance of an OFX Controller, can be used to run the jobs necessary for the machine containers.
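The rule that a host needs at most one controller, however many machine containers it runs, amounts to an install-if-absent check keyed by host. The following is a minimal sketch under that reading; the ControllerRegistry class and the host names are hypothetical.

```python
from typing import Dict, List


class ControllerRegistry:
    """Tracks which hosts already have a controller installed."""

    def __init__(self) -> None:
        self._by_host: Dict[str, str] = {}
        self.installs: List[str] = []  # hosts where an install actually ran

    def ensure_controller(self, host: str) -> str:
        """Install a controller on host only if none is present yet."""
        if host not in self._by_host:
            self._by_host[host] = f"controller@{host}"
            self.installs.append(host)
        return self._by_host[host]


registry = ControllerRegistry()
gsx_controller = registry.ensure_controller("host-a")    # GSX container on host-a
msvs_controller = registry.ensure_controller("host-a")   # MSVS container, same host
image_controller = registry.ensure_controller("host-b")  # image server on host-b
```

Both containers on host-a end up sharing one controller, and the install step runs only once per host.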
[0047] Next is step 86. This step only needs to run in the case of converting to a virtual machine. When the target machine is a virtual machine, PowerConvert 24 runs a job on the host of the virtual machine server 40 to create and manage a virtual machine 12. Each type of virtual machine server 40 provides its own API that can be used to create and manage one of its virtual machines. The PowerConvert actions running in the jobs make calls to the virtual machine server 40 through the available APIs.
[0048] By default, the properties of a virtual machine 12 are set to reflect the properties of the source machine. While configuring the conversion in the PowerConvert GUI 26, the user has the option to adjust many of the properties of the new virtual machine 12 to make optimal use of the resources available. The following properties of a virtual machine 12 may be configured:
a) The display name (as used by the virtual machine server)
b) Memory (RAM)
c) Minimum memory size
d) Memory shares
e) Number and size of the hard disks
f) Hard disk controller types (IDE or SCSI)
g) Number of CPUs
h) CPU min and max, shares and affinity
i) Number of NICs and the mapping to a virtual adapter
Finally, the new virtual machine 12 is registered with the virtual server 40.
[0049] We now move to step 88. This step is run only in the case of conversion to a virtual machine. A virtual machine 12 has been created, but it cannot run because there is no operating system installed on the machine.
[0050] The OFX controller 44 on the virtual server 40 is responsible for running this job.
In this job, the newly created virtual machine is modified so that it connects to a virtual CDROM, which contains a copy of the boot image (WinPE or Linux Ramdisk). Then the virtual machine is forced to reboot. When the machine restarts, it will boot from the CDROM.
The boot image will load, and a controller 44 will be installed and configured.
[0051] In this step it is also possible to temporarily modify the memory footprint of the virtual machine for the purposes of running it under the control of PowerConvert 24. For example, it may be suitable for the user to configure the target virtual machine to run with 128MB of RAM, but the overall conversion speed can be improved in some situations if the machine under control is given additional virtual memory to utilize.
[0052] There is no need to take control of a target physical machine during the conversion process, since this already happens during the discovery stage when the machine is booted from the CDROM.
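The conditions attached to steps 82 through 88 can be summarized as a small decision function. This is a sketch of the rules as stated in the preceding paragraphs, not of the actual implementation; the function name and step labels are assumptions, and the "if necessary" check for an already installed controller is omitted.

```python
from typing import List


def preparation_steps(source: str, target: str) -> List[str]:
    """Return the early conversion steps for type codes P, V or I."""
    steps = []
    if source == "I":
        # An image is managed through jobs on its host image server.
        steps.append("install controller on source machine container")
    if target in ("V", "I"):
        # Virtual machines and images are managed through their host.
        steps.append("install controller on target machine container")
    if target == "V":
        steps.append("create VM")
        steps.append("take control of target machine")
    # A physical target is already under control after discovery,
    # so no take-control step is added for it here.
    return steps
```

For example, a P2V conversion needs the target container controller, the new virtual machine and a take-control step, while a P2P conversion needs none of these preparatory steps.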
[0053] Moving next to step 90, now that the VM has been created and is under the control of PowerConvert 24, disk partitions and volumes are created.
[0054] Moving next to step 94, PowerConvert 24 takes control of a source machine directly. The job runs in PowerConvert controller 58. The platform-specific boot image is copied to the source machine. Next, the boot configuration file is backed up and modified to refer to the new image. Finally, the source machine is forced to reboot.
[0055] When the source machine reboots, it will boot from the new boot image.
In one embodiment, the OFX controller 44 is contained in the boot image, so it does not need to be downloaded. Instead, it only needs to be configured. As soon as the machine boots into the boot image, the original boot loader configuration files are restored. This allows the machine to be restored back into its native operating system as soon as it is rebooted, even if it is rebooted in error.
[0056] Moving now to step 96, the source and target machines are ready to begin copying.
If either the source or target machine is a physical or virtual machine, it is running under control. That is, it is running within a boot image, with a controller configured. If either the source or target machine is a machine image, then the controller of the machine container's host is used. In any case, there are two controllers ready to handle the copying of files. A 'copy source' job is scheduled to run on the source machine's controller, and a 'copy target' job is scheduled to run on the target machine's controller. In the jobs, one side binds to a network port and waits for a connection from the other side. Either the source or the target may be configured to listen on a port. In most cases, it does not matter which side listens, since the conversion is taking place between two machines that are under control. Once a connection is made, the transfer can begin.
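The listen-and-connect arrangement between the two copy jobs can be sketched with ordinary TCP sockets. This is a loopback illustration under assumptions: here the target side listens and the source side connects (the text allows either), and the function names copy_source, copy_target and run_transfer are hypothetical.

```python
import socket
import threading
from typing import List


def copy_target(server: socket.socket, received: List[bytes]) -> None:
    """Target side: wait for the source to connect, then read everything."""
    conn, _addr = server.accept()
    chunks = []
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # the source closed the connection
                break
            chunks.append(data)
    received.append(b"".join(chunks))


def copy_source(port: int, payload: bytes) -> None:
    """Source side: connect to the listening target and send the data."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as conn:
        conn.sendall(payload)


def run_transfer(payload: bytes) -> bytes:
    """Run both jobs on the local machine and return what the target got."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received: List[bytes] = []
    worker = threading.Thread(target=copy_target, args=(server, received))
    worker.start()
    copy_source(port, payload)
    worker.join()
    server.close()
    return received[0]
```

Swapping the roles, so that the source listens and the target connects, changes only which side calls bind and accept.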
[0057] PowerConvert 24 uses a file-based copy process. The source side begins with the root folder of a given volume and traverses the file system reading each file and folder. As each file and folder is found, the source side writes it to the socket connection. The data is streamed across the network in the OFX Package format. On the target side, the OFX Package is read from the network connection, one file at a time. As each new file arrives, it is recreated on the target machine with all of its associated properties. The intention is to recreate each file and folder exactly as it was on the source machine. The file transfer continues for each volume that is configured to be copied. The user has the option of choosing not to copy one or more volumes, if so desired. Further, to reduce the amount of data transferred, files that the operating system can recreate on the target are not copied from the source machine.
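A file-based copy of this kind can be sketched as a tree walk that writes one record per file to a stream, paired with a reader that recreates each file. The record layout below (a length-prefixed path followed by the file data) is a hypothetical stand-in and is not the OFX Package format; the sketch also ignores file properties and empty folders, and the skip list is an assumed example.

```python
import io
import os
import struct
import tempfile
from typing import List

HEADER = ">HQ"  # path length (2 bytes), data length (8 bytes)
SKIP_NAMES = {"pagefile.sys"}  # example of a file the target OS can recreate


def write_tree(root: str, stream) -> None:
    """Walk root and write each file as a header, a path and its data."""
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            if name.lower() in SKIP_NAMES:
                continue
            full = os.path.join(folder, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/").encode("utf-8")
            with open(full, "rb") as handle:
                data = handle.read()
            stream.write(struct.pack(HEADER, len(rel), len(data)))
            stream.write(rel)
            stream.write(data)


def read_tree(stream, dest: str) -> List[str]:
    """Recreate every file record found in stream under dest."""
    restored = []
    header_size = struct.calcsize(HEADER)
    while True:
        header = stream.read(header_size)
        if len(header) < header_size:
            break
        path_len, size = struct.unpack(HEADER, header)
        rel = stream.read(path_len).decode("utf-8")
        target = os.path.join(dest, *rel.split("/"))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as handle:
            handle.write(stream.read(size))
        restored.append(rel)
    return restored


def demo() -> List[str]:
    """Round-trip a tiny tree through an in-memory stream."""
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        os.makedirs(os.path.join(src, "docs"))
        with open(os.path.join(src, "docs", "a.txt"), "wb") as handle:
            handle.write(b"hello")
        with open(os.path.join(src, "pagefile.sys"), "wb") as handle:
            handle.write(b"swap")
        buffer = io.BytesIO()
        write_tree(src, buffer)
        buffer.seek(0)
        return read_tree(buffer, dst)
```

The in-memory buffer stands in for the network connection; any object with read and write methods, such as a socket file wrapper, could take its place.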

[0058] As mentioned earlier, PowerConvert 24 uses a file-based copy process.
That is, each individual file and folder is copied from the source to the target. The alternative to this is an image-based copy. In an image-based copy, the entire contents of a file system are read from the disk byte-by-byte, regardless of the file system.
[0059] There are several advantages to using a file-based copy instead of an image-based copy, as follows:
1. Resizing of volumes. At configuration time, the user may decide that the size of a volume on the source machine is not optimal for the target machine.
a) For example, the C: drive on a Windows source machine is 20GB in size, and now near capacity. In this case, the corresponding volume on the target machine can be configured with an increased size of, say, 50GB.
b) Similarly, a volume on the source may be underutilized. It may be sized at 120GB, but only ever use about 10GB. In this case, the corresponding volume on the target machine can be configured with a smaller size of, say, 20GB.
2. Automatic defragmentation of the file system on the target machine. Any file on the source machine may be fragmented. That is, its data is not stored contiguously on the disk.
During PowerConvert's file transfer step, files are written to the target's disk one file at a time. Each file will naturally occupy the next available sectors of the disk, since the disk starts off with a clean file system.
3. Filtering specific files so that they are not copied or are changed during the copy process. Files that can be recreated without copying, such as the swap file for the Windows operating system, need not be copied, which often saves 1GB or more of data.
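The volume resizing advantage rests on a simple constraint: because only files are copied, the target volume needs to be large enough for the data in use, not for the full size of the source volume. A minimal sketch of that check follows; the Volume type and resize_for_target function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    name: str
    size_gb: int
    used_gb: int


def resize_for_target(source: Volume, new_size_gb: int) -> Volume:
    """Return the target volume, refusing sizes too small for the data."""
    if new_size_gb < source.used_gb:
        raise ValueError(
            f"{source.name}: {new_size_gb}GB cannot hold {source.used_gb}GB of files"
        )
    return Volume(source.name, new_size_gb, source.used_gb)


# An underutilized 120GB volume holding about 10GB can shrink to 20GB.
shrunk = resize_for_target(Volume("D:", 120, 10), 20)
# A 20GB volume that is nearly full can grow to 50GB.
grown = resize_for_target(Volume("C:", 20, 18), 50)
```

An image-based copy would not permit the shrinking case as directly, since it reproduces the source disk contents regardless of how much space the files occupy.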
[0060] Moving now to step 98, PowerConvert 24 prepares the operating system to boot while it is still under control. It is only at this time that PowerConvert 24 has full access to the operating system that has just completed copying from the source. The following steps are taken:
a) Update drivers. Device drivers are installed on the operating system. The drivers installed are those that match the plug and play identification of the devices on the target machine, which are determined at machine discovery time. For devices such as mass storage devices, it is vital to update the drivers while the machine is under control; otherwise the machine may never be able to boot.
b) Update Hardware Abstraction Layer (HAL) and kernel files. HAL and kernel files are updated, if necessary.
c) Update boot configuration file (boot.ini or grub.conf or linux.conf) so that the new machine will boot from the appropriate partition.
d) Update hostname, as configured by the user.
e) Update network connections. This is done at this time for Linux only; for Windows it must be done later.
f) Disable VMware tools, if necessary.
g) Disable MSVS additions, if necessary.
h) Update Windows services or Linux daemons, as configured by the user.
[0061] Moving now to step 100, the target virtual machine is restarted. This step runs on the target machine container. Its purpose is to 'undo' the take control. This involves undoing the temporary changes that were needed during the take control step, including the disconnection of the virtual CDROM and resetting the memory size back to the user configured amount. In the case of a physical machine, this step is run from PowerConvert controller 58. It schedules a job to run on the target physical machine's OFX controller 44, and instructs it to reboot.
[0062] Step 102 only needs to run for Windows target machines. For Linux, the target machine is fully configured by the end of the Prepare OS to Boot step 98. This step runs within a small Windows service that is injected into the target earlier and does the following:
a) Restore mount points on volumes
b) Configure network connections
c) Generate new Session Id
d) Join a domain or workgroup, as configured by user
e) Restore NT4 file security

[0063] Step 104 ends the process and is optional. This step brings the source machine out of the PowerConvert boot image and back into the native operating system. The user may want to 'move' a machine, instead of 'cloning' a machine. In this case, they may not want to restart the source machine, and the machine is left 'under control'. If the user does want their source machine restarted, then PowerConvert 24 will relinquish control of that machine by running a 'reboot' job on the controller while the machine is under control.
[0064] To aid the reader in understanding the function of OFX 28 we will now describe some OFX terms.
[0065] A device is a generic term for a physical or virtual device that can be controlled. Examples of devices would be a computer, a virtual machine, a software application, a network switch or a group of devices. Devices can be nested to form a hierarchy. Information on a device includes a Globally Unique ID (guid), a display name, a security descriptor and Extensible Markup Language (XML) instance data. PowerConvert 24 extends the use of this instance data to store its own model of a machine object in XML format as discussed above with reference to Figure 5.
[0066] An OFX job defines a set of actions. Jobs are executed by an OFX
controller 44 and are versioned. Jobs may be scheduled against devices or controllers.
[0067] Actions provide implementation behavior for jobs. Actions allow developers and users to extend the use of OFX 28 with custom behavior for custom solutions and applications. Actions are implemented as dynamic link libraries and are reusable among jobs.
[0068] An OFX package is a binary format that is used for file distribution.
It is similar in notion to a .tar file or a .zip file.
[0069] OFX packages may be used in several ways namely:
1. OFX Action packages. When OFX controller 44 needs to execute an action, it must load a specified .dll. This .dll and any dependent .dlls are archived in a package available for OFX controller 44 to download from OFX 28. In this scenario OFX controller 44 requires the .dlls with their names and content, but little else.
2. OFX Job packages. These typically contain data files that are needed during the execution of a job. Multiple job packages can be used in a job.


3. PowerConvert File Transfer. During a conversion using PowerConvert 24, all of the files from a source machine are copied across a network to a target machine.
For each file transferred all of the file's properties are transferred with it so that the file can be recreated on the target machine exactly as it was on the source machine.
4. PowerConvert Image Server. When archiving machine images on an image server, for each file in the archive, all of its properties and contents are stored so that it can be recreated at some later time when the machine image is deployed to a target machine.
[0070] The same package format is used for both Windows and Linux and is portable to any operating system format.
[0071] The structure of a package comprises four main components, a Package Preamble, Package Headers, File Headers and Files. We will now describe each in turn.
[0072] A Package Preamble identifies the package version.
[0073] Package Headers consist of a set of zero or more Headers that pertain to the package as a whole. The format of Package Headers is as follows:
Number of Headers    four bytes
Header 1             variable size
Header n             variable size
Package Headers can be used by the package author to provide additional information or hints about the source or the contents. For example, a package header could provide a hint about the approximate number of files or the estimated size of the package. This gives the program reading the package enough information to estimate progress between 1% and 100%.
[0074] File Headers have the same format as Package Headers and consist of a set of zero or more Headers. The only difference from Package Headers is that the properties relate to a file, instead of the entire package. File Headers provide a great deal of flexibility as a new File Header can be added whenever the need arises to describe some additional property of a special file type. For example File Header names used in one embodiment may include:
Attributes       An integer representation of the file attributes
CreationTime     An integer representation of the file creation time
LastWriteTime    An integer representation of the last modification time
Backup           Utilizes Windows backup semantics (Boolean)
[0075] A Header can be used for either packages or files. A Header includes a name/value pair that describes a property of either the package or the file.
Each header has a MustUnderstand Boolean flag. When writing a package the author sets this value to TRUE if the reader must understand the meaning of this header. When reading a package the reader must check this flag. If the reader does not understand the meaning of the header, it will fail.
In describing headers and other components we will hereinafter refer to a "Length-prefixed UTF8 string"; this is a short form for "Four byte length-prefixed eight bit Unicode transformation format encoded string". One format of a header is as follows:
Name              Length-prefixed UTF8 string
Value             Length-prefixed UTF8 string
MustUnderstand    One byte (Boolean)
[0076] The Files section consists of a sequence of File entries. The number of files in the package is never specified. Instead there exists a Boolean marker with the value TRUE before the beginning of each new File to indicate that a file follows. A Boolean marker with the value FALSE will follow the last file in the package. With this format the writer of a package does not need to pre-calculate the number of files in the package. This is beneficial during a file transfer between two machines. As each file is read from the disk, it can be immediately sent across the network to the target machine. Thus there is no need to wait until the package is fully assembled. The Files section may be embodied in a format as follows:
MoreFlag               One byte (Boolean TRUE)
File 1                 Variable size
MoreFlag               One byte (Boolean TRUE)
File 2                 Variable size
MoreFlag               One byte (Boolean TRUE)
File n                 Variable size
End of Files Marker    One byte (Boolean FALSE)
[0077] Each File describes the properties and contents of a file or directory.
A File structure is as follows:
Full Name of File       Length-prefixed UTF8 string
Is Directory            One byte (Boolean)
File Length in Bytes    Eight bytes
File Headers            Variable Size
File Contents           Variable Size
[0078] The OFX package format is designed to be flexible for the easy addition of extensions. For example, file compression can be added without changing the File format by adding a File Header with the type "compressed" for each file that requires compression. On the target machine the package reader will know that the file is in a compressed format when it sees the "compressed" File Header.
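The package layout described above (preamble, Package Headers, then a stream of MoreFlag-prefixed Files) can be sketched in Python as follows. The text does not specify byte order or the preamble layout, so little-endian integers and a four-byte version number are assumptions here, as are the function names and the `ApproxFileCount` header name.

```python
import io
import struct

def write_str(out, s):
    data = s.encode("utf-8")
    out.write(struct.pack("<I", len(data)) + data)  # four-byte length prefix

def read_str(inp):
    (n,) = struct.unpack("<I", inp.read(4))
    return inp.read(n).decode("utf-8")

def write_headers(out, headers):
    out.write(struct.pack("<I", len(headers)))      # Number of Headers
    for name, value, must_understand in headers:
        write_str(out, name)
        write_str(out, value)
        out.write(struct.pack("<?", must_understand))

def read_headers(inp, known):
    (count,) = struct.unpack("<I", inp.read(4))
    result = {}
    for _ in range(count):
        name, value = read_str(inp), read_str(inp)
        (must,) = struct.unpack("<?", inp.read(1))
        if must and name not in known:              # MustUnderstand check
            raise ValueError(f"header not understood: {name}")
        result[name] = value
    return result

def write_package(out, version, package_headers, files):
    out.write(struct.pack("<I", version))           # preamble (assumed layout)
    write_headers(out, package_headers)
    for name, is_dir, headers, content in files:    # may be a generator
        out.write(b"\x01")                          # MoreFlag TRUE
        write_str(out, name)
        out.write(struct.pack("<?Q", is_dir, len(content)))
        write_headers(out, headers)
        out.write(content)
    out.write(b"\x00")                              # End of Files Marker

def read_package(inp, known):
    (version,) = struct.unpack("<I", inp.read(4))
    package_headers = read_headers(inp, known)
    files = []
    while inp.read(1) == b"\x01":                   # stop at the FALSE marker
        name = read_str(inp)
        is_dir, length = struct.unpack("<?Q", inp.read(9))
        headers = read_headers(inp, known)
        files.append((name, is_dir, headers, inp.read(length)))
    return version, package_headers, files

buf = io.BytesIO()
write_package(buf, 1, [("ApproxFileCount", "2", False)],
              [("etc", True, [], b""),
               ("etc/hosts", False, [("Attributes", "32", True)], b"127.0.0.1 localhost\n")])
buf.seek(0)
version, pkg_headers, files = read_package(buf, known={"Attributes", "ApproxFileCount"})
```

Because the writer never emits a file count, `files` can be a generator that yields each file as it is read from disk, which matches the streaming behaviour the text describes.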
[0079] We will now provide more detail on PowerConvert Web Services Interface 56 and OFX Web Services Interface 46.
[0080] PowerConvert Web Services Interface 56 deals primarily with Operations and Machines. To deal with these it provides an Operation Web Service and a Machine Web Service. The Operation Web Service provides a wrapper around the OFX job information, especially with respect to tracking the relationships between all of the remote jobs that make up a single conversion process. GUI 26 calls the Operation Web Service when it wants to check on the status of a conversion, using OperationWebService.GetOperation(). The Operation Web Service also includes methods such as AbortOperation and DeleteOperation.
[0081] The Machine Web Service handles all requests related to machines, including:
a) GetMachine(), which retrieves the machine properties.
b) Discover(), which schedules a job on PowerConvert controller 58 to gather information about the specified machine, and adds the results to database 62.
c) Undiscover(), which removes the machine from database 62.
d) ConvertToMachineContainer(), which schedules a job on PowerConvert controller 58 to convert a source machine to a contained machine (i.e. x2V or x2I).
[0082] OFX Web Services Interface 46 provides access to OFX Business Server 60. Web services are organized into three groups: Installation/Setup, Controllers-Only and Runtime.
[0083] Installation/Setup web services are used to configure OFX 28 with jobs, actions and packages. These are primarily used at installation time. Examples of Installation/Setup web services are:
a) JobWebService allows the user to add, delete, modify and query job definitions including actions such as: AddJob, GetJob, GetJobs, DeleteJob and SetJob.
b) ActionTypeWebService allows the user to add, delete, modify and query ActionType definitions including actions such as: AddActionType, DeleteActionType, GetActionType and SetActionType.
c) PackageWebService allows the user to add, delete, modify and upload package definitions including actions such as: AddPackage, GetPackage, SetPackage and UploadPackage.
d) SchemaWebService provides a means for defining the structure, content and semantics of XML documents in more detail. This web service is used to add and get schemas and includes actions such as, AddSchema, GetSchema and GetSchemas.
e) ImportExportWebService is used to import jobs, action types, devices, schemas and packages into OFX 28. The input is an XML document that contains the definitions of one or more jobs, action types and packages. This web service will call each of JobWebService, ActionTypeWebService, DeviceWebService, SchemaWebService and PackageWebService as required.
f) ConfigurationWebService is used to get and set security descriptor fields in database 62. These include jobs, action types, devices, controllers and packages.
Actions include GetRootSecurityDescriptor and SetRootSecurityDescriptor.


[0084] Controller-Only web services are used by OFX controllers 44. There are two types of Controller-Only web services, namely ControllerNotificationWebService and ControllerPackageDownload.
[0085] ControllerNotificationWebService is used by OFX controllers 44 to notify OFX
28 of the controller status and the status of the jobs they are running. The three operations are:
a) Startup, which provides notification to OFX 28 when OFX controller 44 starts.
b) Heartbeat, which is used by an OFX controller 44 to inform OFX 28 that it (the controller) is still running. An OFX controller 44 sends to OFX 28 a snapshot of any jobs that it is currently running. OFX 28 then sends the OFX controller 44 any new jobs that have been scheduled to run on the OFX controller 44. Typically this occurs every five seconds or so.
c) Shutdown, which provides notification to OFX 28 when an OFX controller 44 stops.
[0086] ControllerPackageDownload is used by OFX controllers 44 to download any packages that are required to execute a job. A job is never executed until an OFX controller 44 has successfully downloaded all of the dependent packages. Packages are defined by a guid and a version. When an OFX controller 44 downloads a package, the package is cached on disk, in case it is needed later to run another job that has the same package dependency.
[0087] Runtime Web Services are of three types: JobSchedulingWebService, ControllerWebService and DeviceWebService.
[0088] JobSchedulingWebService includes the following services:
a) ScheduleJob, which is used for defining a new job instance, along with input parameters and the device or controller for the job to be executed on.
b) GetScheduledJob, which returns information about the specified job. This includes the job status, the status of each action in the job and any input and output data.
c) GetControllerJobs, which returns all of the jobs related to a specific controller 44.
d) AbortJob, which aborts a running job.
e) DeleteJob, which is used to delete a scheduled job when it is no longer needed on the system.


[0089] ControllerWebService includes the following services:
a) AddController, which is used to define and configure a new OFX controller 44 just before the controller is deployed to a machine.
b) SetController, which is used for modifying the properties of controller 44.
c) DeleteController, which is used to delete a specified OFX controller 44 entry in the database.
d) GetController, which returns the properties and attributes of a specified controller.
e) GetControllers, which returns properties and attributes of all controllers 44.
f) GetControllerEventLogEntries, which returns the event log entries for OFX controller 44.
g) ClearControllerEventLogEntries, which clears the event log entries for OFX controller 44.
[0090] DeviceWebService allows the user to add, delete, modify and query device definitions. It includes services such as: AddDevice, SetDevice, GetDevice, GetDevices, GetDeviceIds, and DeleteDevice.
[0091] An example of a SOAP request and response for the service ControllerWebService.GetController follows. The fields in bold are placeholders that need to be replaced with actual values.
Example - ControllerWebService.GetController
POST /ofxweb/Controller.asmx HTTP/1.1
Host: shark.platespin.com
Content-Type: text/xml; charset=utf-8
Content-Length: length
SOAPAction: "http://schemas.platespin.com/ofx/ws/GetController"

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<id xmlns="http://schemas.platespin.com/ofx/ws/">string</id>
<options xmlns="http://schemas.platespin.com/ofx/ws/">Properties or Data or SecurityDescriptor</options>
</soap:Body>
</soap:Envelope>

HTTP/1.1 200 OK
Content-Type: text/xml; charset=utf-8
Content-Length: length

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<controller id="string" status="Unknown or Running or Stopped" dateCreated="dateTime" dateModified="dateTime"
bootFile="string" bootFromNetwork="boolean"
bootPlatform="Windows or Linux" bootExpected="boolean"
xmlns="http://schemas.platespin.com/ofx/ws/">
<description>string</description>
<data>xml</data>
<macAddresses>
<macAddress>string</macAddress>
<macAddress>string</macAddress>
</macAddresses>
<securityDescriptor>base64Binary</securityDescriptor>
</controller>
</soap:Body>
</soap:Envelope>
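As an illustration only, a client could build the GetController request and read the response with the Python standard library as sketched below. The namespaces, host and path are taken from the example above; the function names and the use of plain HTTP are assumptions, and no request is actually sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
OFX_NS = "http://schemas.platespin.com/ofx/ws/"

def build_get_controller(controller_id: str, options: str) -> bytes:
    """Serialize a GetController envelope with <id> and <options> in the body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, f"{{{OFX_NS}}}id").text = controller_id
    ET.SubElement(body, f"{{{OFX_NS}}}options").text = options
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)

def make_request(url: str, payload: bytes) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP POST carrying the envelope."""
    return urllib.request.Request(
        url, data=payload, method="POST",
        headers={"Content-Type": "text/xml; charset=utf-8",
                 "SOAPAction": '"http://schemas.platespin.com/ofx/ws/GetController"'})

def parse_controller(response_xml: bytes) -> dict:
    """Extract a few controller fields from a GetController response."""
    root = ET.fromstring(response_xml)
    ctrl = root.find(f"{{{SOAP_NS}}}Body/{{{OFX_NS}}}controller")
    macs = [m.text for m in ctrl.iter(f"{{{OFX_NS}}}macAddress")]
    return {"id": ctrl.get("id"), "status": ctrl.get("status"), "macAddresses": macs}

payload = build_get_controller("hypothetical-controller-guid", "Properties")
request = make_request("http://shark.platespin.com/ofxweb/Controller.asmx", payload)
```

Passing `request` to `urllib.request.urlopen` would send it; the response body could then be handed to `parse_controller`.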
[0092] Referring now to Figure 7, a block diagram of the components of an OFX
controller 44 is shown. An OFX controller 44 runs on a computer host to control one or more OFX devices, and it executes jobs provided to it by OFX 28. The design of OFX
controllers 44 is independent of platform architecture and uses OFX Web Services Interface 62 to communicate with OFX 28. OFX controllers 44 are generic in that they know nothing about the actions they are executing. All the code needed to execute an action is downloaded on demand from OFX 28.
[0093] An OFX controller 44 for Windows is deployed from OFX 28 by using NetBIOS and WMI to copy OFX controller 44 to a remote machine and register it as a Windows Service. Similarly, an OFX controller 44 for a Linux machine is deployed using the SSH protocol. In both cases, administrator credentials are required on the targeted machine.
[0094] At deployment time each OFX controller 44 is configured with:
a) The URL of OFX 28, which the OFX controller 44 uses to contact OFX 28.
b) A unique guid so that it can be identified by OFX 28.
c) A randomly generated symmetric encryption key for security. This "secret", along with a nonce value, is used as a signature for the purpose of preventing replay attacks.
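The text does not say how the secret and nonce are combined into a signature. One common construction, shown here purely as an assumption, is an HMAC over the nonce and message, with the receiver rejecting any nonce it has already seen. The class and function names are hypothetical.

```python
import hashlib
import hmac
import secrets

def sign(secret: bytes, nonce: bytes, payload: bytes) -> str:
    # Assumed construction: HMAC-SHA256 keyed with the shared secret.
    return hmac.new(secret, nonce + payload, hashlib.sha256).hexdigest()

class ReplayGuard:
    """Server-side check: valid signature and a nonce never seen before."""

    def __init__(self, secret: bytes):
        self.secret = secret
        self.seen = set()

    def verify(self, nonce: bytes, payload: bytes, signature: str) -> bool:
        if nonce in self.seen:                      # replayed message
            return False
        expected = sign(self.secret, nonce, payload)
        if not hmac.compare_digest(expected, signature):
            return False
        self.seen.add(nonce)
        return True

secret = secrets.token_bytes(32)   # generated at deployment time
guard = ReplayGuard(secret)
nonce = secrets.token_bytes(16)
message = b"<heartbeat/>"
signature = sign(secret, nonce, message)
first_attempt = guard.verify(nonce, message, signature)
replayed_attempt = guard.verify(nonce, message, signature)
```

Here `first_attempt` is accepted and `replayed_attempt` is rejected, because the same nonce is presented twice.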
[0095] As shown in Figure 7, an OFX controller 44 comprises five main components: notification service 110, job manager 112, scheduler service 114, package manager 116 and job execution process(es) 118.
[0096] Notification service 110 regularly checks in with OFX Business Server 60 through OFX Web Services Interface 62 to report status and to determine if any new jobs are waiting.
This checking in is referred to as a "heartbeat" and occurs frequently, typically on the order of every five seconds but is user definable. On each heartbeat OFX controller 44 will send a snapshot of the status of each running job and the latest log file entries.
Log file entries are maintained by the controller to indicate the status of a job. Examples of log file entries are: job received and package downloaded. A log file provides a running status of the job progression. Notification Service 110 will receive any new jobs that have been scheduled to run on an OFX controller 44 since the last heartbeat. An OFX controller 44 will keep sending a snapshot of a given job at the heartbeat until the job has completed running. The benefit of such a design is that the dataflow across a network is minimized.
[0097] Job manager 112 is responsible for persisting all of the job XML files on disk.
When notification service 110 receives a new job from OFX 28, it forwards that job to job manager 112, which will immediately persist that job XML to disk. Next, job manager 112 will notify scheduler service 114 that a new job has arrived. Also, when notification service 110 is preparing a heartbeat for OFX 28 it will ask job manager 112 for all the jobs that have been modified since the last successful heartbeat. As a job executes, the job data will change. For example, the status of an action will change from "NotRun" to "Running" when the action starts. Job manager 112 stores this information in the form of an XML file in job XML folder 120.
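The interplay between persisting job XML and reporting only modified jobs on each heartbeat can be sketched as below. This is a simplified illustration, not the described implementation: the class name, method names and the "dirty set" bookkeeping are assumptions.

```python
import tempfile
from pathlib import Path

class JobManager:
    """Persists job XML to a folder and tracks jobs changed since the last heartbeat."""

    def __init__(self, folder: Path):
        self.folder = folder
        self.dirty = set()   # ids of jobs modified since the last successful heartbeat

    def save(self, job_id: str, job_xml: str) -> None:
        (self.folder / f"{job_id}.xml").write_text(job_xml, encoding="utf-8")
        self.dirty.add(job_id)

    def snapshot(self) -> dict:
        """Return the XML of every modified job, for inclusion in a heartbeat."""
        return {j: (self.folder / f"{j}.xml").read_text(encoding="utf-8")
                for j in sorted(self.dirty)}

    def heartbeat_succeeded(self, sent: dict) -> None:
        # Only clear jobs that have not changed again since the snapshot was taken.
        for job_id, xml in sent.items():
            if (self.folder / f"{job_id}.xml").read_text(encoding="utf-8") == xml:
                self.dirty.discard(job_id)

with tempfile.TemporaryDirectory() as d:
    manager = JobManager(Path(d))
    manager.save("job-1", '<job status="NotRun"/>')
    first = manager.snapshot()
    manager.heartbeat_succeeded(first)
    after_ack = manager.snapshot()
    manager.save("job-1", '<job status="Running"/>')
    second = manager.snapshot()
```

After the first heartbeat is acknowledged nothing is resent (`after_ack` is empty) until the job status changes again.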
[0098] Scheduler service 114 is responsible for the running of each job. Scheduler service 114 schedules a job for a given time and controls a queue of jobs to be run. Before a job can be executed any dependent packages must first be downloaded, so package manager 116 is sent a request to download the packages. Each job has one or more associated packages, which contain everything needed to execute a job. The packages required to run a job are specified in the job XML. Scheduler service 114 executes a job by spawning a separate job execution process 118. Scheduler service 114 then waits for the process 118 to exit, at which time it will check the exit status of the process. If the job execution process 118 failed for any reason, scheduler service 114 is responsible for setting the job status to "Failed" by modifying the job XML in job XML folder 120. Finally, scheduler service 114 informs job manager 112 that the job has completed.
[0099] Each job is run in its own process 118 to protect any other running jobs. There is one new process created for each job that needs to be run. There is no limit to the number of jobs that can run concurrently on a single OFX controller 44, except for the usual memory, disk and CPU resource constraints of a machine.
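A minimal sketch of the one-process-per-job pattern follows, using Python's subprocess module. The job commands are stand-ins, and the status name "Completed" is an assumption; only "Failed" appears in the text.

```python
import subprocess
import sys

def spawn(job_id: str, command: list):
    """Start a job in its own operating system process."""
    return job_id, subprocess.Popen(command)

def wait_all(processes) -> dict:
    """Wait for each job process to exit and map its exit code to a job status."""
    return {job_id: ("Completed" if proc.wait() == 0 else "Failed")
            for job_id, proc in processes}

# Two stand-in jobs run concurrently; the second exits with a non-zero status.
running = [
    spawn("job-ok", [sys.executable, "-c", "print('running actions')"]),
    spawn("job-bad", [sys.executable, "-c", "import sys; sys.exit(3)"]),
]
statuses = wait_all(running)
```

A crash in one job process changes only that job's status; the other process is unaffected, which is the isolation property the paragraph describes.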
[00100] The job execution process 118 is responsible for running the job. It does this by loading and executing each of the actions specified in the job. The job execution process is also responsible for setting the status of each action as it changes, as well as the overall job status. It is also responsible for flushing the job XML to the Job XML folder 120 at the completion of each action.
[00101] Package manager 116 communicates with an OFX file server 122 to download all packages specified in a job. OFX file server 122 is under the control of OFX 28. Package manager 116 will also store its own cache of packages that have been downloaded previously in packages folder 124. Packages are identified by a name (guid) and a version. Package manager 116 will access cached packages in packages folder 124. As the number of packages to download for a specific job can vary, and their sizes can vary, any request to package manager 116 is asynchronous. Package manager 116 will notify scheduler service 114 when it completes downloading all of the packages for a specific job. Package manager 116 automatically handles retries, in the case that the download of a package fails because of some temporary network difficulties.
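The caching and retry behaviour can be sketched as follows. For brevity this sketch is synchronous, whereas the text describes asynchronous requests; the class name, cache file naming and retry count are assumptions.

```python
import tempfile
import time
from pathlib import Path

class PackageManager:
    """Downloads packages keyed by (guid, version), with a disk cache and retries."""

    def __init__(self, cache_dir: Path, fetch, retries: int = 3, delay_s: float = 0.0):
        self.cache_dir = cache_dir
        self.fetch = fetch        # callable(guid, version) -> bytes; may raise OSError
        self.retries = retries
        self.delay_s = delay_s

    def get(self, guid: str, version: int) -> Path:
        path = self.cache_dir / f"{guid}_{version}.pkg"
        if path.exists():         # cache hit: no download needed
            return path
        for attempt in range(1, self.retries + 1):
            try:
                data = self.fetch(guid, version)
            except OSError:
                if attempt == self.retries:
                    raise         # give up after the last attempt
                time.sleep(self.delay_s)
            else:
                path.write_bytes(data)
                return path

calls = []

def flaky_fetch(guid, version):
    """Stand-in for the file server: fails once, then succeeds."""
    calls.append((guid, version))
    if len(calls) == 1:
        raise ConnectionError("temporary network difficulty")
    return b"package bytes"

with tempfile.TemporaryDirectory() as d:
    manager = PackageManager(Path(d), flaky_fetch)
    first = manager.get("hypothetical-guid", 1).read_bytes()
    second = manager.get("hypothetical-guid", 1).read_bytes()
```

The second `get` is served from the cache, so the file server stand-in is contacted only twice in total (one failure, one success).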
[00102] To better illustrate the types of information utilized by OFX 28 we will now briefly describe the contents of database 62. In one embodiment database 62 is a relational SQL database containing a plurality of tables. The main tables provide information on:
packages, scheduled jobs, controllers, devices and actions. As an example of the structure of database 62, the tables for scheduled jobs and controllers would be linked together by controller id. The scheduled jobs table would include information about a job, such as a job id, a job version, a device id, status, date scheduled and other fields. The controllers table would include information about a controller such as: id, security descriptor, description, pointers to a bootfile, status, and other fields. It is not the intent of the inventors to restrict the use of the present invention to a specific implementation of database 62 but rather to indicate that it serves as a repository for OFX 28.
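As one possible illustration of the two tables mentioned above, the following sketch creates them in an in-memory SQLite database and joins them on the controller id. All table and column names are assumptions chosen to mirror the fields listed in the text, and the rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE controllers (
    id                  TEXT PRIMARY KEY,
    security_descriptor BLOB,
    description         TEXT,
    boot_file           TEXT,
    status              TEXT
);
CREATE TABLE scheduled_jobs (
    job_id         TEXT,
    job_version    INTEGER,
    device_id      TEXT,
    controller_id  TEXT REFERENCES controllers(id),
    status         TEXT,
    date_scheduled TEXT
);
""")
conn.execute("INSERT INTO controllers (id, description, status) VALUES (?, ?, ?)",
             ("ctrl-1", "sample host", "Running"))
conn.execute("INSERT INTO scheduled_jobs VALUES (?, ?, ?, ?, ?, ?)",
             ("job-1", 1, "dev-1", "ctrl-1", "NotRun", "2005-01-01T12:00:00"))

# Jobs together with the status of the controller they are scheduled on.
rows = conn.execute("""
    SELECT j.job_id, j.status, c.status
    FROM scheduled_jobs AS j
    JOIN controllers AS c ON c.id = j.controller_id
""").fetchall()
```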
[00103] Figure 8 is a block diagram of the components of PowerRecon 32. PowerRecon 32 is designed to aid in the consolidation of operating systems, applications and data on servers in data center 22. PowerRecon 32 monitors the servers to collect information and provides detailed plans on how a consolidation may be accomplished. Information may be collected through the use of Windows services such as WMI or Windows Performance Counters. In the case of Linux, commands to collect information may be issued through a service such as SSH. Through GUI 26, a user may examine the information collected by PowerRecon 32 and select which consolidations should occur.
[00104] PowerRecon 32 comprises two main modules, Web Services Application Programming Interface (API) 130 and Software Developer's Kit (SDK) 132 which communicate with each other. Web Services API 130 comprises three modules, inventory 134, performance 136 and analysis 138.
[00105] In one embodiment there are five groups of methods provided by Web Services 130. They are:
1) Autonomic. These provide analysis and optimization web services.

2) Inventory. These provide inventory gathering, machine and machine container information, group information, and security credentials. Machines contain containers; groups can contain other groups and machines.
3) Nodes. These provide hierarchical definitions of the groups, machines and containers.
4) Performance Data Collection. These provide methods for the starting and stopping of data collection and retrieval of performance data.
5) Reports. These report on actions that the system is performing. They return information of the state and progress of a task such as running an analysis, running inventory or running optimization.
[00106] Examples of methods in Web Services API 130 used by inventory 134 include:
a) Import. This imports information about a machine and returns the id of the machine as stored in the PowerOptimize database 146.
b) Inventory. This discovers a set of servers of the same platform type, such as all ESX machines.
[00107] Examples of methods in Web Services API 130 used by performance 136 include:
a) Start. This starts performance monitoring on a machine.
b) Stop. This stops performance monitoring on a machine.
c) GetMetricData. This gets the metric data collected for a machine.
d) GetMonitoredMachines. This provides a list of machines being monitored for performance metrics.
[00108] Examples of methods in Web Services API 130 used by analysis 138 include:
a) AnalyseMachine. This starts an analysis on a single machine, comparing it against a set of thresholds to determine if it is overused, underused or within a target range.
b) StartAnalysis. This performs the same tests as AnalyseMachine but does it on a group of machines.
[00109] Inventory 134, through the use of OFX controllers described earlier in reference to OFX 28, collects information on a server. Information to be collected includes detailed information about machines and containers. The information collected is stored in PowerOptimize database 146. This information may be refreshed at the request of the user or automatically on a regular schedule.
[00110] Performance 136 examines the data collected in PowerRecon database 148. This data includes information on the performance data that needs to be collected, such as: disk I/O, CPU usage, CPU pages, NIC Megabits per Second (Mbps) and memory usage. PowerRecon database 148 also stores historical performance data that has been collected. As a part of the Web Services API 130, it returns the data requested from database 148.
[00111] Analysis 138 examines the information collected by inventory 134 and performance 136 stored in databases 146 and 148 to make suggestions to optimize the performance of the servers in data center 22. Figure 9 illustrates the logical steps of analysis 138.
[00112] PowerRecon SDK 132 comprises two main modules, database gatherer 140 and realtime gatherer 142. When a request is received to obtain information on a server in data center 22 from Web Services API 130, PowerRecon SDK 132 utilizes a Simple Performance Inspector (SPI) interface 144 to obtain information on the server. An SPI 144 may be created to act as an adapter for any source of performance data. An SPI 144 runs on a server and is configured to the server platform to utilize platform specific tools.
Information may be collected using a variety of methods such as SSH for Linux, Windows Performance Counters, or tools provided by products such as VMware to best collect the data. In addition, multiple SPIs mitigate losses of data.
[00113] Database gatherer 140 extracts information from a server and returns it to Web Services API 130 to be stored in database 148. Realtime gatherer 142 collects information on a server in realtime and returns it to Web Services API 130 to be displayed to a user via GUI 26.
[00114] Figure 9 is a flow chart of the functionality of the analysis portion (138) of PowerRecon 32 and is shown generally as 150. Beginning at step 152 databases 146 and 148 are queried for inventory and performance information on each server submitted for analysis.
At step 154 a possibility test is made. For example, are there enough resources available on a target machine to move other machines to it? If three machines each require 1GB of RAM and the target machine has only 2GB of RAM, then such a transfer is not possible. Moving to step 156, an intelligence test is performed. This step verifies that the usage of each resource (e.g. Disk I/O and CPU) for the machines to be transferred does not conflict.
For example, if two machines have intensive disk I/O usage, it would be useful to have them on servers with low I/O usage. In another example, if two machines have high CPU usage from 3:00PM to 6:00PM Monday to Friday, it would be efficient to have them on separate servers. Conversely, if one machine had a high CPU usage from 8:00 to 12:00 from Monday to Friday and another from 12:00 to 4:00, they would be good candidates to reside on the same server. Any number of analyses may be used to determine a good fit for machines on servers based upon resource use, such as: mean, trend analysis, standard deviation, or min/max values. At this step a suggestion may also be made of the size of a new virtual machine. For example, if a machine was configured for 2GB of RAM, but only used 256MB most of the time, a suggestion may be made to reduce the RAM of the target machine to 512MB and perhaps increase the paging file size. In another scenario, if the server is a heavily used single processor machine, a suggestion could be made to move to a two CPU machine.
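The possibility test and one form of the intelligence test can be sketched as follows. The hourly profile representation, the 80%/5% load levels and the 100% combined limit are illustrative assumptions; the text leaves the choice of analysis open.

```python
def possible(target_ram_gb: float, required_ram_gb: list) -> bool:
    """Possibility test: does the target have enough RAM for all candidates?"""
    return sum(required_ram_gb) <= target_ram_gb

def hourly(busy_hours, busy: float = 80.0, idle: float = 5.0) -> list:
    """Build a 24-entry CPU usage profile (percent) from a set of busy hours."""
    return [busy if hour in busy_hours else idle for hour in range(24)]

def peaks_conflict(usage_a: list, usage_b: list, limit: float = 100.0) -> bool:
    """Intelligence test: would the combined load exceed the limit in any hour?"""
    return any(a + b > limit for a, b in zip(usage_a, usage_b))

# Three machines needing 1GB each do not fit on a 2GB target.
fits = possible(2, [1, 1, 1])

morning = hourly(range(8, 12))      # busy 8:00 to 12:00
afternoon = hourly(range(12, 16))   # busy 12:00 to 4:00
evening_a = hourly(range(15, 18))   # busy 3:00PM to 6:00PM
evening_b = hourly(range(15, 18))

complementary = not peaks_conflict(morning, afternoon)   # good candidates to share
clashing = peaks_conflict(evening_a, evening_b)          # better kept apart
```

With these sample profiles, the morning and afternoon machines never exceed the combined limit, while the two evening machines do.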
[00115] Moving to step 158 the solutions are generated and stored in database 148 so that the user may view and change them via GUI 26. Any changes made by the user are verified by steps 152 and 156. If the user selects a solution, it is then passed to PowerConvert 24 for execution.
[00116] Moving to step 160 the user may request after a certain amount of time to return to step 152 to generate another analysis for the new configuration to determine if it is functioning as expected.
[00117] Referring now to Figure 10, a block diagram of the components of PowerOptimize 34 is shown. PowerOptimize 34 is directed toward load balancing, right sizing and self healing of data centers. Through the use of PowerRecon 32 it can utilize both historical and real time data to provide suggested changes. PowerOptimize 34 comprises five main logic engines, namely: expert systems 170, bin packing 172, neural networks 174, fuzzy logic 176 and user options 178. Suggestions 180 coordinates the information collected by each engine and interacts with analysis 138 and GUI 26 to allow a user to select an action to be taken.
Alternatively, should the user wish an action to be taken automatically, suggestions 180 will instruct PowerConvert 24 to make the suggested changes.

[00118] Expert systems 170 is based upon facts, rules, actions and an inference engine. The number of facts grows each time the inference engine runs. The rules define what should be done with the facts. The facts are a collection of known elements such as: machines, metric data and thresholds. Examples of metric data would include: processor utilization by CPU, memory usage, disk space, bytes read and written from and to a disk, and bytes sent and received for a NIC. Thresholds are values above or below which a decision can be made.
[00119] The rules define how the system acts based upon the facts. For example:
a) If a CPU is overused, then add virtual machines on a server to handle the overuse.
b) If a server has room, then convert from a physical to a virtual machine, i.e. does it have enough disk space, memory, CPU and NIC to support the physical machine as a virtual machine.
c) If the number of virtual machines on a server is beyond a threshold (e.g. four), then no new virtual machines may be added.
Rule c) is an example of a constraint, which may restrict movements of machines or images.
The inference engine processes the rules based upon: preferences (e.g. CPU usage should be maximized), constraints (e.g. a maximum of 4 virtual machines on a server), cost and priorities, and parameter assignment. Parameter assignment appends a new value to the facts.
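The interplay of facts, rules and an inference engine can be sketched as a small forward-chaining loop. This is a minimal illustration under assumptions, not the engine described here: the `Rule` structure, the fact names (`cpu_percent`, `cpu_high`, `vm_count`, `max_vms`) and the three sample rules are hypothetical. Each rule that fires appends new values to the facts, which is how the fact set grows from one pass to the next.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, float]


@dataclass
class Rule:
    name: str
    condition: Callable[[Facts], bool]
    action: Callable[[Facts], Facts]  # returns new facts to append


def run_inference(facts: Facts, rules: List[Rule], max_passes: int = 10) -> List[str]:
    """Fire each rule at most once, repeating passes until no rule makes progress."""
    fired: List[str] = []
    for _ in range(max_passes):
        progressed = False
        for rule in rules:
            if rule.name not in fired and rule.condition(facts):
                facts.update(rule.action(facts))  # parameter assignment
                fired.append(rule.name)
                progressed = True
        if not progressed:
            break
    return fired


# Hypothetical rules loosely mirroring the CPU-overuse rule and the VM-count constraint.
RULES = [
    Rule("cpu_overused",
         lambda f: f["cpu_percent"] > f["cpu_high"],
         lambda f: {"cpu_overused": 1.0}),
    Rule("vm_limit_reached",
         lambda f: f["vm_count"] >= f["max_vms"],
         lambda f: {"server_full": 1.0}),
    Rule("suggest_add_vm",
         lambda f: bool(f.get("cpu_overused")) and not f.get("server_full"),
         lambda f: {"suggest_add_vm": 1.0}),
]

facts = {"cpu_percent": 92.0, "cpu_high": 70.0, "vm_count": 2.0, "max_vms": 4.0}
print(run_inference(facts, RULES))
```

With `vm_count` set to 4 instead of 2, the constraint rule fires and the suggestion rule is suppressed.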
[00120] Expert systems 170 conducts an analysis within the context of a set of machines (the machine pool), comprising source and target candidates. The logic that follows describes a rule referred to as "the zone". The zone is the range between the low and high thresholds a user may define for a machine's usage. For example, if the low threshold is 30 and the high threshold is 70, the user has identified that they would like performance for a component such as a CPU or disk to be within the zone between 30 and 70. This can be defined per machine, per component.
a) Target machines may be any container and empty physical machines.
b) The heuristic for this rule is straightforward; for every metric by which the performance of a machine is evaluated, there is a desired range. The goal is to have all physical machines within the zone.

c) Within the desired range is a "sweet spot" used to rank machines and prioritize alternate solution scenarios.
d) If a physical machine in the pool is in the zone, it shall not be moved.
e) If a virtual machine in the pool is in the zone, it may be converted to another virtual server to enable more efficient use of virtual machines on virtual servers.
f) If all the target candidates in the machine pool are in the zone, no conversions are possible.
g) If all source machines are in the zone, no conversions are suggested.
h) Physical machines that are below the range should be moved to a virtual machine server, thereby freeing up the physical machine. P2V conversions are prioritized in such a way as to maximize the number of virtual machines on a virtual machine server.
i) In the absence of P2V candidates, virtual machines are migrated between underutilized virtual machine servers. The priority is to have the virtual machine server with the least powerful hardware in the zone first. The rationale is that by leaving the most powerful hardware available, there will be greater flexibility for future analyses.
j) Physical servers that are above the zone should be left where they are, as there is no solution, unless more powerful hardware exists for a conversion target, in which case a possible solution is a P2V or P2P conversion.
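The zone rule for physical machines can be sketched as follows. The names, the inclusive treatment of the two thresholds, and the choice of the zone midpoint as the "sweet spot" are assumptions made for illustration; the text does not say where the sweet spot lies.

```python
from enum import Enum


class ZoneStatus(Enum):
    BELOW = "below the zone"
    IN_ZONE = "in the zone"
    ABOVE = "above the zone"


def classify(usage: float, low: float = 30.0, high: float = 70.0) -> ZoneStatus:
    """Place a usage value relative to the user-defined thresholds (bounds inclusive)."""
    if low >= high:
        raise ValueError("low threshold must be below high threshold")
    if usage < low:
        return ZoneStatus.BELOW
    if usage > high:
        return ZoneStatus.ABOVE
    return ZoneStatus.IN_ZONE


def sweet_spot_distance(usage: float, low: float = 30.0, high: float = 70.0) -> float:
    """Rank helper: distance from the assumed sweet spot at the zone midpoint."""
    return abs(usage - (low + high) / 2.0)


def suggest_for_physical(usage: float, low: float = 30.0, high: float = 70.0,
                         stronger_target_available: bool = False) -> str:
    """Suggest an action for a physical machine, following the zone rules listed above."""
    status = classify(usage, low, high)
    if status is ZoneStatus.IN_ZONE:
        return "leave in place"
    if status is ZoneStatus.BELOW:
        return "P2V candidate"
    # Above the zone: only actionable if more powerful hardware exists.
    return "P2V or P2P candidate" if stronger_target_available else "leave in place"


print(suggest_for_physical(12.0))
print(suggest_for_physical(55.0))
print(suggest_for_physical(95.0, stronger_target_available=True))
```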
[00121] For decision making, performance statistics for CPU usage will be averaged over all instances. For example, percent usage will be averaged over both processors in a dual CPU server. NIC usage will be considered per NIC, since NICs can be on separate networks. Disk usage is considered on an individual basis. Getting disk usage into the zone may require a conversion with disk resizing. Memory is targeted for the zone based on peak usage.
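The aggregation choices in this paragraph can be written down compactly. The sketch below assumes samples are held as lists of readings keyed by a device name; the function names and that data layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def cpu_usage(per_cpu_percent: Dict[str, List[float]]) -> float:
    """Average percent usage over every processor instance."""
    return mean([mean(samples) for samples in per_cpu_percent.values()])


def per_device_usage(per_device: Dict[str, List[float]]) -> Dict[str, float]:
    """NICs and disks are evaluated individually, so keep one figure per device."""
    return {name: mean(samples) for name, samples in per_device.items()}


def memory_usage(samples: List[float]) -> float:
    """Memory is judged on its peak rather than its average."""
    return max(samples)


print(cpu_usage({"cpu0": [20.0, 40.0], "cpu1": [60.0, 80.0]}))
print(per_device_usage({"nic0": [10.0, 30.0], "nic1": [70.0, 90.0]}))
print(memory_usage([256.0, 310.0, 290.0]))
```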
[00122] An example algorithm to select machines for conversion would include the steps:
1) Loop once through the machine pool to select candidate target machines and candidate source machines.
2) If the list of target candidates is empty, return.
3) If the list of source candidates is empty, return.

Based upon the candidates selected we propose two solutions. The first solution is a best fit packing of virtual machines into virtual machine servers. In the list of source candidates include the virtual machines already on virtual machine servers. Then loop through all permutations of source and target machines to find the best layout. The second solution is a variation on the first solution. Virtual machines are left on the virtual machine servers where they currently reside if the server is below the zone. Then search permutations of the remaining source and target machines to find the best layout.
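The two solutions can be sketched as an exhaustive search over assignments of source machines to target servers. This is a simplified illustration under assumptions: each machine and server is reduced to a single load figure, "best" is taken to mean the fewest servers used, and the `pinned` argument (a hypothetical name) models the second solution by fixing chosen virtual machines to their current server. The search is exponential in the number of source machines, so it is practical only for small pools.

```python
from itertools import product
from typing import Dict, Optional


def best_layout(sources: Dict[str, float], targets: Dict[str, float],
                pinned: Optional[Dict[str, str]] = None) -> Optional[Dict[str, str]]:
    """Return a machine -> server layout that uses the fewest servers, or None."""
    if not targets or not sources:
        return None  # nothing to convert, or nowhere to convert to
    pinned = pinned or {}
    names, hosts = list(sources), list(targets)
    best, best_used = None, None
    for choice in product(hosts, repeat=len(names)):
        layout = dict(zip(names, choice))
        # Second solution: machines pinned to a server must stay there.
        if any(layout[machine] != host for machine, host in pinned.items()):
            continue
        load = {host: 0.0 for host in hosts}
        for machine, host in layout.items():
            load[host] += sources[machine]
        if any(load[host] > targets[host] for host in hosts):
            continue  # some server would be over capacity
        used = len(set(choice))
        if best_used is None or used < best_used:
            best, best_used = layout, used
    return best


sources = {"a": 4.0, "b": 3.0}
targets = {"s1": 8.0, "s2": 8.0}
print(best_layout(sources, targets))
print(best_layout(sources, targets, pinned={"b": "s2"}))
```

Machines named in `pinned` are expected to appear in `sources` as well.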
[00123] The above solutions can be implemented simply by defining a set of comparison operators and methods, namely:
a) Vmserver.CanFit(machine a) : True or False depending on whether a virtual machine server has enough of the correct resources to support "machine a" as a virtual machine.
b) Vmserver.NotFull() : True if the virtual machine server can host more virtual machines.
c) machine.BiggerThan(machine a) : True or False depending on whether "machine a" has a bigger footprint than the current machine. A footprint is the collective usage of all components on a machine. It is an n-dimensional representation of the machine's usage of physical hardware and performance data.
d) machine.LessThan(machine a) : True or False depending on whether "machine a" has a smaller footprint than the current machine.
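The four operators can be sketched as methods on two small classes. Everything below is an assumption layered on the description: the class and attribute names are hypothetical, footprints are taken to be normalized fractions per resource, the Euclidean norm is used as one possible way to compare n-dimensional footprints, and the default limit of four guests borrows the example constraint given earlier. The comparison methods follow the wording above literally, so `bigger_than(other)` is true when `other` has the bigger footprint.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Machine:
    name: str
    footprint: Dict[str, float]  # resource -> normalized usage, e.g. {"cpu": 0.5}

    def size(self) -> float:
        """Collapse the n-dimensional footprint to one number (assumed: Euclidean norm)."""
        return math.sqrt(sum(v * v for v in self.footprint.values()))

    def bigger_than(self, other: "Machine") -> bool:
        """True if `other` has a bigger footprint than this machine (per the text)."""
        return other.size() > self.size()

    def less_than(self, other: "Machine") -> bool:
        """True if `other` has a smaller footprint than this machine (per the text)."""
        return other.size() < self.size()


@dataclass
class VmServer:
    name: str
    capacity: Dict[str, float]
    max_vms: int = 4
    guests: List[Machine] = field(default_factory=list)

    def free(self) -> Dict[str, float]:
        """Remaining capacity per resource after the current guests."""
        return {r: cap - sum(g.footprint.get(r, 0.0) for g in self.guests)
                for r, cap in self.capacity.items()}

    def not_full(self) -> bool:
        return len(self.guests) < self.max_vms

    def can_fit(self, machine: Machine) -> bool:
        """True if every resource the machine needs is still available."""
        free = self.free()
        return self.not_full() and all(
            amount <= free.get(resource, 0.0)
            for resource, amount in machine.footprint.items())


server = VmServer("esx1", {"cpu": 1.0, "mem": 1.0})
small = Machine("a", {"cpu": 0.5, "mem": 0.25})
large = Machine("b", {"cpu": 0.75, "mem": 0.1})
print(server.can_fit(small))
server.guests.append(small)
print(server.can_fit(large))
```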
[00124] Bin packing 172 attempts to optimally assign different types of resources, for example, disk space, memory, or CPU usage, to a specific machine to determine if a fit can be made for transferring machines.
[00125] The general problem is figuring out the best way to pack objects into containers. The problem can be described in terms of the NP-complete problems known as multi-dimensional bin packing and the multi-dimensional knapsack problem. A number of algorithms may be pursued to solve this problem; some examples are:
a) First Fit (FF). In this scenario, objects arrive in an unsorted list and are packed into the first bin they fit in.

b) First Fit Decreasing (FFD). In this scenario, objects are sorted in decreasing order and then FF is applied.
c) Best Fit (BF). In this scenario, objects arrive in an unsorted list and are packed into the bin that would leave the least amount of space.
d) Best Fit Decreasing (BFD). In this scenario, the objects are ordered in decreasing order and BF is applied.
e) Next Fit (NF). In this scenario, objects arrive in an unsorted list and are packed into the first bin they fit into. Subsequent objects are packed in the bin starting after the bin where the last object was packed.
f) Next Fit Decreasing (NFD). In this scenario, objects are sorted in decreasing order and NF is applied.
g) Worst Fit (WF). In this scenario, the objects are unsorted and packed in order into the bin that would leave the most amount of space.
h) Worst Fit Decreasing (WFD). In this scenario, the objects are sorted in decreasing order and WF is applied.
i) Permutation Pack (PP). In this scenario, the objects are ordered in increasing order. An object is packed into a container where the object has an inverse resource distribution. For example, where the object has memory usage > disk usage > CPU usage, find a container where memory usage < disk usage < CPU usage.
j) Custom 1 (C1). In this scenario, the objects are ordered in increasing order and the containers are ordered in increasing order. Then the objects are packed in a round robin manner.
k) Custom 2 (C2). In this scenario, the objects are ordered in increasing order and the containers are ordered in increasing order. Objects are packed into a container based upon the mean value of their size. For example, if a container is to hold four items, find four objects whose size is roughly the mean divided by four. Find the mean/4 object in the sorted list and take the subsequent items from either side of it.
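Several of the listed heuristics differ only in how a bin is chosen among those with room, which the following sketch makes explicit. It is a one-dimensional simplification with equal-capacity bins opened on demand, whereas the problem described here is multi-dimensional; the function names are illustrative, and only First Fit, Best Fit, Worst Fit and their "Decreasing" variants are covered.

```python
from typing import Callable, List, Sequence

Bins = List[List[float]]
Chooser = Callable[[List[int], Bins], int]


def _pack(items: Sequence[float], capacity: float, choose: Chooser) -> Bins:
    """Place each item in a bin picked by `choose`, opening a new bin if none fits."""
    bins: Bins = []
    for item in items:
        if item > capacity:
            raise ValueError("item larger than bin capacity")
        candidates = [i for i, b in enumerate(bins) if sum(b) + item <= capacity]
        if candidates:
            bins[choose(candidates, bins)].append(item)
        else:
            bins.append([item])
    return bins


def first_fit(items: Sequence[float], capacity: float) -> Bins:
    # First bin with enough room.
    return _pack(items, capacity, lambda c, b: c[0])


def best_fit(items: Sequence[float], capacity: float) -> Bins:
    # Fullest bin with enough room, leaving the least space.
    return _pack(items, capacity, lambda c, b: max(c, key=lambda i: sum(b[i])))


def worst_fit(items: Sequence[float], capacity: float) -> Bins:
    # Emptiest bin with enough room, leaving the most space.
    return _pack(items, capacity, lambda c, b: min(c, key=lambda i: sum(b[i])))


def decreasing(strategy: Callable[[Sequence[float], float], Bins],
               items: Sequence[float], capacity: float) -> Bins:
    """Sort items largest first, then apply the given strategy (FFD, BFD, WFD)."""
    return strategy(sorted(items, reverse=True), capacity)


loads = [4, 8, 1, 4, 2, 1]
print(first_fit(loads, 10))
print(best_fit(loads, 10))
print(decreasing(first_fit, loads, 10))
```

In a multi-dimensional setting, the `sum(b) + item <= capacity` test would be replaced by a per-resource comparison.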

[00126] Neural networks 174 is a learning model. A neural network contains nodes, which commonly have two inputs and an output. The output is either on or off. The outputs are provided again to other nodes in the network or to other networks. The weights of the connections between nodes change as the system learns from data passed through the system.
As the network, or any other logic engine, makes suggestions on what to do, the user may make a decision on what is actually the best course of action from their perspective. This information would then be fed back to the neural network 174 to make it learn.
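A single two-input node that learns from user decisions can be sketched with the classic perceptron update rule. The learning algorithm is not specified in the text, so the perceptron rule is used here only as a simple stand-in; the class name, the learning rate and the feedback data (a user who accepts a suggestion only when both CPU and memory are flagged high) are assumptions.

```python
from typing import List, Tuple

Feedback = List[Tuple[Tuple[int, int], int]]


class Node:
    """A two-input node whose output is either on (1) or off (0)."""

    def __init__(self, learning_rate: float = 0.5) -> None:
        self.weights = [0.0, 0.0]
        self.bias = 0.0
        self.learning_rate = learning_rate

    def output(self, x1: int, x2: int) -> int:
        total = self.weights[0] * x1 + self.weights[1] * x2 + self.bias
        return 1 if total > 0 else 0

    def learn(self, x1: int, x2: int, user_choice: int) -> None:
        """Adjust weights toward the user's decision (perceptron rule)."""
        error = user_choice - self.output(x1, x2)
        self.weights[0] += self.learning_rate * error * x1
        self.weights[1] += self.learning_rate * error * x2
        self.bias += self.learning_rate * error


def train(node: Node, feedback: Feedback, epochs: int = 10) -> None:
    for _ in range(epochs):
        for (x1, x2), choice in feedback:
            node.learn(x1, x2, choice)


# Inputs: (cpu_high, memory_high); label: whether the user accepted the suggestion.
FEEDBACK: Feedback = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

node = Node()
train(node, FEEDBACK)
print([node.output(*inputs) for inputs, _ in FEEDBACK])
```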
[00127] Fuzzy logic 176 uses human readable rules to make decisions. Rather than using thresholds such as "memory usage > 70%", fuzzy rules may be applied, such as: if memory is above average and memory is increasing, then add more memory. In the use of fuzzy logic the rules are not precise but descriptive of the goal to be achieved. Membership functions drive the fuzzy logic engine. They describe the importance of each input as a set of fuzzy numbers. Based upon these fuzzy numbers from the membership functions, a concrete value is returned. Fuzzy logic 176 could be used to monitor and change resources on a server which are easily modified on a running server. For example, without stopping a virtual machine from running, memory and paging could be adjusted as needed. For certain virtual servers, virtual machines may be moved onto faster servers.
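The example rule "if memory is above average and memory is increasing then add more memory" can be sketched with two membership functions and a simple defuzzification step. The ramp shapes, their breakpoints (50% to 80% usage, 0 to 5 percentage points per hour), the use of the minimum for "and", and the 512MB maximum step are all assumptions chosen for illustration.

```python
def ramp(x: float, start: float, end: float) -> float:
    """Membership degree rising linearly from 0 at `start` to 1 at `end`."""
    if x <= start:
        return 0.0
    if x >= end:
        return 1.0
    return (x - start) / (end - start)


def memory_to_add_mb(usage_percent: float, trend_percent_per_hour: float,
                     max_step_mb: float = 512.0) -> float:
    """Evaluate the fuzzy rule and return a concrete amount of memory to add."""
    above_average = ramp(usage_percent, 50.0, 80.0)
    increasing = ramp(trend_percent_per_hour, 0.0, 5.0)
    strength = min(above_average, increasing)  # fuzzy "and"
    return strength * max_step_mb  # defuzzify by scaling the maximum step


print(memory_to_add_mb(40.0, 3.0))
print(memory_to_add_mb(65.0, 5.0))
print(memory_to_add_mb(90.0, 10.0))
```

Unlike a hard threshold at 70%, the suggested amount grows gradually as usage and trend both rise.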
[00128] User options 178 allows the user to enter their own criteria for suggesting machine conversions. For example, a user could enter values for fuzzy logic fields such as:
a) CPU_VERY_OVERUSED, CPU_OVERUSED, CPU_OPTIMAL, CPU_UNDERUSED, CPU_VERY_UNDERUSED, for each virtual and physical machine.
b) VM_SHARES_UNDERALLOCATED, VM_SHARES_OPTIMAL, VM_SHARES_OVERALLOCATED, for virtual machines, globally or for all virtual machines on a virtual server.
c) SHORT_TIME, SOME_TIME, LONG_TIME, to define time scales for collecting and assessing data globally.
[00129] Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. By way of example, note that the inventors refer to the use of Windows and Linux environments, specific VM products and specific tools such as WinPE and SSH. One skilled in the art will recognize that the present invention is structured to be portable across operating systems and easily adaptable to different computing environments and other virtual machine technology.

Appendix 1 - Machine XML Sample
<?xml version="1.0" encoding="utf-8" ?>
<VMwareESXVirtualMachine xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://schemas.platespin.com/athens/ws/">
<productVersion>Version5_1</productVersion>
<productVersionAtCreation>Version5_1</productVersionAtCreation>
<lastUpdateTime>2005-10-05T10:04:50.0619494-04:00</lastUpdateTime>
<id>e6c2ef66-b9b9-4c63-bf16-6a57c66ace44</id>
<manufacturer>VMware, Inc.</manufacturer>
<model>VMware Virtual Platform</model>
<smbiosUUID>0ed94d56-dcc2-83a3-404a-6602a5041d94</smbiosUUID>
<serialNumber>VMware-56 4d d9 0e c2 dc a3 83-40 4a 66 02 a5 04 1d 94</serialNumber>
<operatingSystem xsi:type="MicrosoftOperatingSystem">
<productVersion>Version5_1</productVersion>
<productVersionAtCreation>Version5_1</productVersionAtCreation>
<type>Windows2000</type>
<hostName>vm3.platespin.com</hostName>
<address>192.168.80.102</address>
<version>5.0.2195</version>
<networkConnections>
<networkConnection>
<name>Local Area Connection</name>
<ipAddresses>
<ipAddress>
<address>192.168.80.102</address>
<subnetMask>255.255.255.0</subnetMask>
</ipAddress>
</ipAddresses>
<macAddress>00-0C-29-04-1D-94</macAddress>
<dhcpEnabled>false</dhcpEnabled>
<DefaultGateways>192.168.80.1</DefaultGateways>
<DnsServers>192.168.220.10</DnsServers>
<DnsServers>10.10.220.10</DnsServers>
<WinsServers>192.168.50.66</WinsServers>
<WinsServers>127.0.0.0</WinsServers>
<peerDns>false</peerDns>
</networkConnection>
</networkConnections>
<volumes>
<volume>

<fileSystem>NTFS</fileSystem>
<size>3133796352</size>
<freeSpace>1624932352</freeSpace>
<serialNumber>8062f759</serialNumber>
<label />
<mountPoints>C:</mountPoints>
<partitions>\disk0\partition0\</partitions>
<isDynamic>false</isDynamic>
<isCompressed>false</isCompressed>
</volume>
<volume>
<fileSystem>NTFS</fileSystem>
<size>535805440</size>
<freeSpace>530402816</freeSpace>
<serialNumber>581d5335</serialNumber>
<label>Compressed</label>
<mountPoints>D:</mountPoints>
<partitions>\disk1\partition0\</partitions>
<isDynamic>false</isDynamic>
<isCompressed>true</isCompressed>
</volume>
</volumes>
<installedPrograms>
<installedProgram>
<displayName>WebFldrs</displayName>
<size>2556</size>
<version>9.50.7522</version>
<category>SystemComponent</category>
</installedProgram>
</installedPrograms>
<acpiSupported>true</acpiSupported>
<domain>PSACME</domain>
<servicePack>4.0</servicePack>
<windowsDirectory>C:\winnt</windowsDirectory>
<description />
<pageFiles>
<pageFile>
<location>C:\pagefile.sys</location>
<maxSize>805306368</maxSize>
</pageFile>
</pageFiles>
<hardwareProfile>1</hardwareProfile>
<languageType>English</languageType>
<systemFileList>
<systemFileInfo>
<fileName>ntoskrnl.exe</fileName>
<internalName>ntoskrnl.exe</internalName>
<companyName>Microsoft Corporation</companyName>
<productVersion>5.00.2195.6717</productVersion>
<languageType>English</languageType>
</systemFileInfo>
<systemFileInfo>
<fileName>ntkrnlpa.exe</fileName>
<internalName>ntkrnlpa.exe</internalName>
<companyName>Microsoft Corporation</companyName>
<productVersion>5.00.2195.6717</productVersion>
<languageType>English</languageType>
</systemFileInfo>
<systemFileInfo>
<fileName>hal.dll</fileName>
<internalName>halaacpi.dll</internalName>
<companyName>Microsoft Corporation</companyName>
<productVersion>5.00.2195.6691</productVersion>
<languageType>English</languageType>
</systemFileInfo>
</systemFileList>
<windowsServices>
<windowsService>
<name>Abiosdsk</name>
<description />
<displayName>Abiosdsk</displayName>
<status>Stopped</status>
<startMode>Disabled</startMode>
<type>KernelDriver</type>
<pathToExecutable />
</windowsService>
<windowsService>
<name>abp480n5</name>
<description />
<displayName>abp480n5</displayName>
<status>Stopped</status>
<startMode>Disabled</startMode>
<type>KernelDriver</type>
<pathToExecutable />
</windowsService>
<!-- removed some for clarity -->
</windowsServices>
<controlSet>1</controlSet>
</operatingSystem>
<memory>268435456</memory>
<status>Running</status>
<components>
<component xsi:type="NetworkAdapter">

<manufacturer>Advanced Micro Devices (AMD)</manufacturer>
<model>AMD PCNET Family PCI Ethernet Adapter</model>
<deviceId>0</deviceId>
<pnpId>PCI\VEN_1022&DEV_2000&SUBSYS_20001022&REV_10\3&61AAA01&0&88</pnpId>
<macAddress>00-0C-29-04-1D-94</macAddress>
</component>
<component xsi:type="DiskDrive">
<manufacturer>(Standard disk drives)</manufacturer>
<model>VMware Virtual disk SCSI Disk Device</model>
<deviceId>\\.\PHYSICALDRIVE0</deviceId>
<pnpId>SCSI\DISK&VEN_VMWARE&PROD_VIRTUAL_DISK&REV_1.0\4&5FCAAFC&0&000</pnpId>
<size>3142056960</size>
<type>SCSI</type>
<partitions>
<partition>
<name>\disk0\partition0\</name>
<size>3133799424</size>
<startingOffset>32256</startingOffset>
<active>true</active>
<partitionType>7</partitionType>
<primary>true</primary>
</partition>
</partitions>
</component>
<component xsi:type="DiskDrive">
<manufacturer>(Standard disk drives)</manufacturer>
<model>VMware Virtual disk SCSI Disk Device</model>
<deviceId>\\.\PHYSICALDRIVE1</deviceId>
<pnpId>SCSI\DISK&VEN_VMWARE&PROD_VIRTUAL_DISK&REV_1.0\4&5FCAAFC&0&010</pnpId>
<size>536870912</size>
<type>SCSI</type>
<partitions>
<partition>
<name>\disk1\partition0\</name>
<size>535805952</size>
<startingOffset>16384</startingOffset>
<active>false</active>
<partitionType>7</partitionType>
<primary>true</primary>
</partition>
</partitions>
</component>
<component xsi:type="Processor">
<manufacturer>GenuineIntel</manufacturer>
<model>Intel(R) Xeon(TM) CPU 3.06GHz</model>
<deviceId>CPU0</deviceId>
<speed>3059</speed>
</component>
<component xsi:type="ScsiRaidController">
<manufacturer>BusLogic</manufacturer>
<model>BusLogic MultiMaster PCI SCSI Host Adapter</model>
<deviceId>PCI\VEN_104B&DEV_1040&SUBSYS_1040104B&REV_01\3&61AAA01&0&80</deviceId>
<pnpId>PCI\VEN_104B&DEV_1040&SUBSYS_1040104B&REV_01\3&61AAA01&0&80</pnpId>
<driverName>buslogic</driverName>
</component>
</components>
<role>None</role>
<PlateSpinDiscovered>true</PlateSpinDiscovered>
<operatingSystemType>Windows2000</operatingSystemType>
<numberOfCpus>1</numberOfCpus>
<cpuMin>1</cpuMin>
<cpuMax>1</cpuMax>
</VMwareESXVirtualMachine>
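A consumer of this machine description might extract a few sizing figures as sketched below. The sketch uses Python's standard `xml.etree.ElementTree` on a trimmed copy of the sample; the `summarize` function and the choice of fields are illustrative assumptions. Note that the raw `&` characters in the `pnpId` and `deviceId` values of the full sample would have to be written as `&amp;` before a strict XML parser would accept it.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample above, keeping only the elements read below.
SAMPLE = """<VMwareESXVirtualMachine xmlns="http://schemas.platespin.com/athens/ws/">
  <operatingSystem>
    <hostName>vm3.platespin.com</hostName>
    <volumes>
      <volume>
        <size>3133796352</size>
        <freeSpace>1624932352</freeSpace>
        <mountPoints>C:</mountPoints>
      </volume>
      <volume>
        <size>535805440</size>
        <freeSpace>530402816</freeSpace>
        <mountPoints>D:</mountPoints>
      </volume>
    </volumes>
  </operatingSystem>
  <memory>268435456</memory>
  <numberOfCpus>1</numberOfCpus>
</VMwareESXVirtualMachine>"""

# The sample declares a default namespace, so every lookup must be qualified.
NS = {"ps": "http://schemas.platespin.com/athens/ws/"}


def summarize(xml_text: str) -> dict:
    """Pull host name, memory, CPU count and used disk space from a machine record."""
    root = ET.fromstring(xml_text)
    volumes = root.findall("ps:operatingSystem/ps:volumes/ps:volume", NS)
    used = sum(
        int(v.findtext("ps:size", namespaces=NS))
        - int(v.findtext("ps:freeSpace", namespaces=NS))
        for v in volumes
    )
    return {
        "host": root.findtext("ps:operatingSystem/ps:hostName", namespaces=NS),
        "memory_bytes": int(root.findtext("ps:memory", namespaces=NS)),
        "cpus": int(root.findtext("ps:numberOfCpus", namespaces=NS)),
        "disk_used_bytes": used,
    }


print(summarize(SAMPLE))
```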

Claims (24)

WE CLAIM:
1. A system for remotely monitoring usage of machines in a data center and suggesting conversions between machines to make efficient use of resources in said data center, said system comprising:
a) a data collection engine; and
b) an optimization engine operatively coupled to said data collection engine.
2. The system of claim 1 wherein said system is operatively coupled to a machine conversion engine for the purpose of executing a conversion remotely and automatically.
3. The system of claim 1 further comprising a machine conversion engine and a job management engine operatively coupled to each of said data collection engine and said optimization engine.
4. The system of claim 1 wherein said data collection engine comprises means for collecting data and storing it in a database and means for collecting data in real time and presenting it to a user via a graphical interface.
5. The system of claim 1 wherein said data collection engine comprises a web services interface, said interface comprising an inventory module, a performance module and an analysis module.
6. The system of claim 5 wherein said inventory module comprises means for collecting information on the configuration of said machines and storing the same in a database.
7. The system of claim 5 wherein said performance module comprises means for extracting performance information from a database of information collected by said data collection engine and means for providing it to a requestor.
8. The system of claim 5 wherein said analysis module comprises means for examining possible conversions between machines and means for determining the viability of a conversion.
9. The optimization engine of claim 1 wherein said optimization engine utilizes one or more analysis modules to provide suggestions on converting machines.
10. The optimization engine of claim 9 wherein said analysis module is an expert system.
11. The optimization engine of claim 9 wherein said analysis module comprises means for utilizing bin packing.
12. The optimization engine of claim 9 wherein said analysis module is a neural network.
13. The optimization engine of claim 9 wherein said analysis module comprises means for utilizing fuzzy logic.
14. The optimization engine of claim 9 wherein said analysis module comprises means for accepting analysis parameters from a user.
15. A method for remotely monitoring usage of machines in a data center to make efficient use of resources in the data center, the method comprising the steps of:
collecting performance and machine data;
analyzing the data; and
suggesting conversions between machines.
16. The method of claim 15 further comprising the step of executing a conversion remotely and automatically.
17. The method of claim 15 further comprising the step of storing said data in a database and presenting it to a user via a graphical interface.
18. The method of claim 15 further comprising the step of collecting said data in real time and presenting it to a user via a graphical interface.
19. The method of claim 15 wherein said analyzing utilizes an expert system.
20. The method of claim 15 wherein said analyzing utilizes bin packing.
21. The method of claim 15 wherein said analyzing utilizes a neural network.
22. The method of claim 15 wherein said analyzing utilizes fuzzy logic.
23. The method of claim 15 further comprising the step of accepting analysis parameters from a user.
24. A computer readable medium, said medium comprising instructions for remotely monitoring usage of machines in a data center to make efficient use of resources in the data center, said instructions implementing the steps of:
collecting performance and machine data;
analyzing the data; and
suggesting conversions between machines.
CA 2524550 2004-10-26 2005-10-25 A system for optimizing server use in a data center Abandoned CA2524550A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CA 2524550 CA2524550A1 (en) 2004-10-26 2005-10-25 A system for optimizing server use in a data center

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CA2486103 2004-10-26
CA002486103A CA2486103A1 (en) 2004-10-26 2004-10-26 System and method for autonomic optimization of physical and virtual resource use in a data center
CA 2524550 CA2524550A1 (en) 2004-10-26 2005-10-25 A system for optimizing server use in a data center

Publications (1)

Publication Number Publication Date
CA2524550A1 true CA2524550A1 (en) 2006-04-26

Family

ID=36242698

Family Applications (1)

Application Number Title Priority Date Filing Date
CA 2524550 Abandoned CA2524550A1 (en) 2004-10-26 2005-10-25 A system for optimizing server use in a data center

Country Status (1)

Country Link
CA (1) CA2524550A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013178394A1 (en) * 2012-05-31 2013-12-05 Alcatel Lucent Load distributor, intra-cluster resource manager, inter-cluster resource manager, apparatus for processing base band signals, method and computer program for distributing load

Legal Events

Date Code Title Description
EEER Examination request
FZDE Dead

Effective date: 20170712